funannotate predict fails with test and with real data: "Not enough gene models ... to train Augustus" #742

@IanDMedeiros

Are you using the latest release?
funannotate v1.8.11.

Describe the bug
funannotate test is failing at the predict step with the error "Not enough gene models 175 to train Augustus (200 required), exiting". This appears to be the same error as #552. I am also getting similar errors with real data. The end of the #552 discussion suggested that the error might be related to GeneMark. I am having trouble setting up GeneMark-ES, but shouldn't the program just run without it?
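A minimal sketch (not part of funannotate) for pulling the two counts out of the error message quoted above, to see how far a run falls short of the training threshold. The function name `parse_shortfall` and the regular expression are assumptions based only on the message text in this report; other versions may word the message differently.

```python
import re

# Pattern inferred from the message text in this report (an assumption,
# not taken from the funannotate source code).
PATTERN = re.compile(
    r"Not enough gene models (\d+) to train Augustus \((\d+) required\)"
)

def parse_shortfall(message):
    """Return (found, required, missing), or None if the message does not match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    found = int(match.group(1))
    required = int(match.group(2))
    return found, required, required - found

msg = "Not enough gene models 175 to train Augustus (200 required), exiting"
print(parse_shortfall(msg))  # (175, 200, 25)
```

For the test run reported here, the shortfall is 25 gene models.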

What command did you issue?
funannotate test -t all --cpus 10

What probably isn't the problem, based on what I have tried so far
Bad Augustus installation. I was getting an Augustus error even earlier in funannotate test, so I replaced the Augustus that was installed by mamba with one (v. 3.3.3) already available on our system.
AUGUSTUS_CONFIG_PATH permissions. Ran chmod 777 $AUGUSTUS_CONFIG_PATH/species and the error did not go away.
Multithreading. Tried with --cpus 1 and 10 ... same error.
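The permissions check above can be confirmed from Python. This is a rough sketch, assuming only that `AUGUSTUS_CONFIG_PATH` is set in the environment as shown in the dependency check output; the helper name `species_dir_status` is hypothetical. It tests effective access for the current user, which `chmod 777` on its own does not confirm (for example, if a parent directory is not traversable).

```python
import os

def species_dir_status(config_path):
    """Report whether <config_path>/species exists and is writable by this user."""
    species = os.path.join(config_path, "species")
    return {
        "path": species,
        "exists": os.path.isdir(species),
        # W_OK | X_OK: permission to create entries in the directory and enter it
        "writable": os.access(species, os.W_OK | os.X_OK),
    }

if __name__ == "__main__":
    config = os.environ.get("AUGUSTUS_CONFIG_PATH")
    if config is None:
        print("AUGUSTUS_CONFIG_PATH is not set")
    else:
        print(species_dir_status(config))
```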

Logfiles
funannotate-predict.log
`[06/29/22 21:19:42]: /hpc/home/idm7/miniconda3/envs/annotate/bin/funannotate predict -i test.softmasked.fa --protein_evidence protein.evidence.fasta -o annotate --cpus 10 --species Awesome busco

[06/29/22 21:19:42]: OS: CentOS Stream 8, 46 cores, ~ 230 GB RAM. Python: 3.8.12
[06/29/22 21:19:42]: Running funannotate v1.8.11
[06/29/22 21:19:42]: GeneMark path: /hpc/group/bio1/ian/envs/funannotate/gmes_petap
[06/29/22 21:19:42]: Full path to gmes_petap.pl: /hpc/group/bio1/ian/envs/funannotate/gmes_petap/gmes_petap.pl
[06/29/22 21:19:42]: GeneMark appears to be functional? False
[06/29/22 21:19:43]: {'augustus': 1, 'hiq': 2, 'genemark': 0, 'pasa': 6, 'codingquarry': 0, 'snap': 1, 'glimmerhmm': 1, 'proteins': 1, 'transcripts': 1}
[06/29/22 21:19:43]: Skipping CodingQuarry as no --rna_bam passed
[06/29/22 21:19:43]: {'augustus': 'busco', 'snap': 'busco', 'glimmerhmm': 'busco'}
[06/29/22 21:19:43]: Parsed training data, run ab-initio gene predictors as follows:
[06/29/22 21:19:44]: {'augustus': 1, 'hiq': 2, 'genemark': 0, 'pasa': 6, 'codingquarry': 0, 'snap': 1, 'glimmerhmm': 1, 'proteins': 1, 'transcripts': 1}
[06/29/22 21:19:45]: Loading genome assembly and parsing soft-masked repetitive sequences
[06/29/22 21:19:45]: Genome loaded: 6 scaffolds; 3,776,588 bp; 19.75% repeats masked
[06/29/22 21:20:12]: join_mult_hints.pl
[06/29/22 21:20:12]: Running BUSCO to find conserved gene models for training ab-initio predictors
[06/29/22 21:20:12]: /hpc/home/idm7/miniconda3/envs/annotate/bin/python /hpc/home/idm7/miniconda3/envs/annotate/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-BUSCO2.py -i /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/genome.softmasked.fa -m genome --lineage /hpc/group/bio1/ian/envs/funannotate_db/dikarya -o awesome_busco -c 10 --species anidulans -f --local_augustus /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/ab_initio_parameters/augustus
[06/29/22 21:25:12]: 175 valid BUSCO predictions found, validating protein sequences
[06/29/22 21:26:04]: 175 BUSCO predictions validated
[06/29/22 21:26:04]: Not enough gene models 175 to train Augustus (200 required), exiting
busco.log
INFO ****************** Start a BUSCO 2.0 analysis, current time: 06/29/2022 21:20:12 ******************
INFO The lineage dataset is: dikarya_odb9 (eukaryota)
INFO Mode is: genome
INFO Maximum number of regions limited to: 3
INFO To reproduce this run: python /hpc/home/idm7/miniconda3/envs/annotate/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-BUSCO2.py -i /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/genome.softmasked.fa -o awesome_busco -l /hpc/group/bio1/ian/envs/funannotate_db/dikarya/ -m genome -c 10 -sp anidulans
INFO Check dependencies...
INFO Check input file...
INFO Temp directory is ./tmp/

INFO ****** Phase 1 of 2, initial predictions ******
INFO ****** Step 1/3, current time: 06/29/2022 21:20:12 ******
INFO Create blast database...
INFO [makeblastdb] Building a new DB, current time: 06/29/2022 21:20:12
INFO [makeblastdb] New DB name: /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/tmp/awesome_busco_4188679581
INFO [makeblastdb] New DB title: /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/genome.softmasked.fa
INFO [makeblastdb] Sequence type: Nucleotide
INFO [makeblastdb] Keep Linkouts: T
INFO [makeblastdb] Keep MBits: T
INFO [makeblastdb] Maximum file size: 1000000000B
INFO [makeblastdb] Adding sequences from FASTA; added 6 sequences in 0.0434968 seconds.
INFO Running tblastn, writing output to /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/blast_output/tblastn_awesome_busco.tsv...
INFO ****** Step 2/3, current time: 06/29/2022 21:20:21 ******
INFO Getting coordinates for candidate regions...
INFO Pre-Augustus scaffold extraction...
INFO Running Augustus prediction using anidulans as species:
INFO [augustus] Please find all logs related to Augustus here: /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/augustus.log
INFO 06/29/2022 21:20:21 => 0% of predictions performed (743 to be done)
INFO 06/29/2022 21:20:56 => 10% of predictions performed (75/743 candidate regions)
INFO 06/29/2022 21:21:24 => 20% of predictions performed (149/743 candidate regions)
INFO 06/29/2022 21:22:04 => 30% of predictions performed (223/743 candidate regions)
INFO 06/29/2022 21:22:39 => 40% of predictions performed (298/743 candidate regions)
INFO 06/29/2022 21:23:01 => 50% of predictions performed (372/743 candidate regions)
INFO 06/29/2022 21:23:21 => 60% of predictions performed (446/743 candidate regions)
INFO 06/29/2022 21:23:38 => 70% of predictions performed (521/743 candidate regions)
INFO 06/29/2022 21:23:55 => 80% of predictions performed (596/743 candidate regions)
INFO 06/29/2022 21:24:08 => 90% of predictions performed (669/743 candidate regions)
INFO 06/29/2022 21:24:20 => 100% of predictions performed
INFO Extracting predicted proteins...
INFO ****** Step 3/3, current time: 06/29/2022 21:24:49 ******
INFO Running HMMER to confirm orthology of predicted proteins:
INFO 06/29/2022 21:24:49 => 0% of predictions performed (602 to be done)
INFO [hmmersearch] Parse failed (sequence file /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/extracted_proteins/EOG092600SD.faa.1):
INFO [hmmersearch] Line 2: illegal character %
INFO [hmmersearch] Parse failed (sequence file /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/extracted_proteins/EOG092603EH.faa.1):
INFO [hmmersearch] Line 2: illegal character %
INFO [hmmersearch] Parse failed (sequence file /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/extracted_proteins/EOG092600T4.faa.1):
INFO [hmmersearch] Line 2: illegal character %
INFO [hmmersearch] Parse failed (sequence file /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/extracted_proteins/EOG092600X0.faa.1):
INFO [hmmersearch] Line 2: illegal character %
INFO [hmmersearch] Parse failed (sequence file /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/extracted_proteins/EOG0926009O.faa.1):
INFO [hmmersearch] Line 2: illegal character %
INFO [hmmersearch] Parse failed (sequence file /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/extracted_proteins/EOG092602I6.faa.3):
INFO [hmmersearch] Line 2: illegal character %

<This goes on for many lines, apparently through all the BUSCO loci. Omitting here for space.>

INFO 06/29/2022 21:24:58 => 100% of predictions performed
INFO Results:
INFO C:13.5%[S:13.3%,D:0.2%],F:0.1%,M:86.4%,n:1312
INFO 177 Complete BUSCOs (C)
INFO 175 Complete and single-copy BUSCOs (S)
INFO 2 Complete and duplicated BUSCOs (D)
INFO 1 Fragmented BUSCOs (F)
INFO 1134 Missing BUSCOs (M)
INFO 1312 Total BUSCO groups searched

INFO ****** Phase 2 of 2, predictions using species specific training ******
INFO ****** Step 1/3, current time: 06/29/2022 21:25:00 ******
INFO Extracting missing and fragmented buscos from the ancestral_variants file...
WARNING The busco id(s) ['EOG0926457R', 'EOG09264XJC', 'EOG0926129I', 'EOG09265F2Y', 'EOG09262N3C', 'EOG092602UY', 'EOG09264R4M', 'EOG09264P3R', 'EOG09264HX6', 'EOG09260OCI', 'EOG09261J0P', 'EOG092610TN', 'EOG09264BOA', 'EOG09261DRB', 'EOG092608L0', 'EOG09261FAX', 'EOG09264X8J', 'EOG09262R8O', 'EOG09262K67', 'EOG09260K5F', 'EOG09263LP3', 'EOG09264S3E', 'EOG09262FJB', 'EOG09260VEY', 'EOG09260TLW', 'EOG092608WI', 'EOG09261OXV', 'EOG09264PE5', 'EOG09261JWS', 'EOG09260NNR', 'EOG09264RJL', 'EOG09262KJA', 'EOG09260VYK', 'EOG092641UM', 'EOG092644N1', 'EOG09262V3O', 'EOG09260XMI', 'EOG09263BDA', 'EOG09264I6B', 'EOG092635ST', 'EOG0926071Q', 'EOG09264PK5', 'EOG09263D2P', 'EOG09260VHV', 'EOG09265RGS', 'EOG092603YJ', 'EOG092621ZV', 'EOG09261801', 'EOG09260ZWU', 'EOG09260PI1', 'EOG092607QZ', 'EOG09262W7C', 'EOG09264V3H', 'EOG09261UWT', 'EOG09264881', 'EOG09263E49', 'EOG09265KQ4', 'EOG09260AZM', 'EOG09264T3S', 'EOG09261TEQ', 'EOG09265BJ3', 'EOG0926522L', 'EOG09262CDO', 'EOG09262H34', 'EOG09264J8E', 'EOG09265FL1', 'EOG0926431P', 'EOG09263M8W', 'EOG09265FTY', 'EOG09262Z2S', 'EOG09264719', 'EOG092625AX', 'EOG09265HEP', 'EOG092618J9', 'EOG09260RVQ', 'EOG09263K45', 'EOG09264R1U', 'EOG09261ICI', 'EOG09263RVR', 'EOG09260WG2', 'EOG09263QUM', 'EOG09264ZWF', 'EOG092646WF', 'EOG09261OLD', 'EOG09263W48', 'EOG092632TF', 'EOG09265552', 'EOG09261D4D', 'EOG09264SET', 'EOG092627XA', 'EOG09262JRP', 'EOG09261P7G', 'EOG09262GNE', 'EOG092636T6', 'EOG092625P1', 'EOG092641M3', 'EOG09262POL', 'EOG09264Z3D', 'EOG09260K29', 'EOG092659DX', 'EOG09264G1I', 'EOG09260289', 'EOG09264C3N', 'EOG09262387', 'EOG09264HU0', 'EOG09264W7W', 'EOG09263WM5', 'EOG092629FB', 'EOG09260KM4', 'EOG092604A0', 'EOG09260FZZ', 'EOG09260GKG', 'EOG09262MJW', 'EOG09260XSR', 'EOG092621S9', 'EOG09261IEV', 'EOG09262TEV', 'EOG092641A6', 'EOG09263DQH', 'EOG09263YBT', 'EOG09263KVG', 'EOG092650VI', 'EOG092653O3', 'EOG09264441', 'EOG0926369X', 'EOG092643IE', 'EOG09261XZ6', 'EOG09264XUV', 'EOG092645OU', 'EOG09261I8J', 'EOG09263WWI', 
'EOG09260NAN', 'EOG09260S2Z', 'EOG09264XYD', 'EOG0926484N', 'EOG09263FGN', 'EOG09260ETR', 'EOG0926506U', 'EOG09262KVB', 'EOG092605ZA', 'EOG0926248P', 'EOG092635DF', 'EOG092641K1', 'EOG0926315C', 'EOG092658QH', 'EOG09261JVS', 'EOG0926307V', 'EOG0926587S', 'EOG092604KQ', 'EOG09260J97', 'EOG09262HP3', 'EOG09264OQ8', 'EOG09263L7Y', 'EOG09261I0F', 'EOG09264ZDZ', 'EOG09262CXO', 'EOG09261I1I', 'EOG09261727', 'EOG09262BVA', 'EOG09265QTV', 'EOG092605VL', 'EOG09260KDB', 'EOG092617S2', 'EOG09262YP5', 'EOG0926407T', 'EOG092629RT', 'EOG092605OK', 'EOG09260EPS', 'EOG09265JNA', 'EOG09260DBG', 'EOG09260NZ8', 'EOG092621F2', 'EOG09261IOS', 'EOG0926539T', 'EOG09264W1U', 'EOG09260KNR', 'EOG09263PWF', 'EOG092610VI', 'EOG09264KDO', 'EOG09261G1Y', 'EOG09262IY3', 'EOG09261VD2', 'EOG09263KDI', 'EOG092658SK', 'EOG09265A08', 'EOG09263K05', 'EOG09263QPR', 'EOG092644WX', 'EOG092631ML', 'EOG09260KUC', 'EOG09262M0W', 'EOG092658NW', 'EOG09263XN3', 'EOG0926506Z', 'EOG09263U71', 'EOG09262TUR', 'EOG09265040', 'EOG092655IF', 'EOG09262E7I', 'EOG092641G3', 'EOG09261XNU', 'EOG09260EE7', 'EOG092645QN', 'EOG0926092K', 'EOG09263MR4', 'EOG09264XVU', 'EOG092610KH', 'EOG09261WJ8', 'EOG09261HZD', 'EOG09261SS1', 'EOG09261CQG', 'EOG0926273Q', 'EOG092619L1', 'EOG09265CCT', 'EOG09260KIY', 'EOG09262N5O', 'EOG092604ZZ', 'EOG09260R9L', 'EOG092654KW', 'EOG092615Y4', 'EOG09261CWO', 'EOG09260NXC', 'EOG09265G5K', 'EOG092612XD', 'EOG092605T6', 'EOG09261ZFN', 'EOG092620FM', 'EOG092646C6', 'EOG09264VC6', 'EOG092649VG', 'EOG09260LVD', 'EOG09265PWR', 'EOG09262PPU', 'EOG09262F22', 'EOG092615CC', 'EOG092616YZ', 'EOG09264RQY', 'EOG092616QN', 'EOG0926400M', 'EOG092648O6', 'EOG09264KO7', 'EOG09264NDD', 'EOG09262GWQ', 'EOG0926458I', 'EOG0926115V', 'EOG09265M98', 'EOG09260TVA', 'EOG09261RWJ', 'EOG09264A2D', 'EOG09260UA2', 'EOG092634MM', 'EOG09265IT6', 'EOG09263760', 'EOG092642UD', 'EOG092609O9', 'EOG09265FTN', 'EOG09265EKJ', 'EOG0926534P', 'EOG09263KZJ', 'EOG09261DG0', 'EOG09260NHN', 'EOG09262OX9', 'EOG09261T98', 'EOG09260WCZ', 
'EOG09262HKC', 'EOG09263F11', 'EOG09261G92', 'EOG09262U7S', 'EOG09264VZ7', 'EOG092602I6', 'EOG09262E4T', 'EOG09262WQX', 'EOG09265HP0', 'EOG09264SSI', 'EOG09260FMW', 'EOG092612AK', 'EOG092600SD', 'EOG09261ACJ', 'EOG09260ZG2', 'EOG09263Y3L', 'EOG09261NLY', 'EOG092655SO', 'EOG092609RF', 'EOG09263CAC', 'EOG09261ABB', 'EOG09264272', 'EOG092651BA', 'EOG09265L8N', 'EOG09261OSU', 'EOG09262MEK', 'EOG09263UN3', 'EOG09260DP1', 'EOG09261AMX', 'EOG09262UAS', 'EOG09262SI7', 'EOG09263KRO', 'EOG09261TPN', 'EOG09260T4S', 'EOG092610QT', 'EOG09262X7T', 'EOG092629ZN', 'EOG092634B1', 'EOG092620EL', 'EOG0926009O', 'EOG09264G0H', 'EOG09262528', 'EOG09260QNB', 'EOG09261EM7', 'EOG092617RY', 'EOG092646CB', 'EOG09261O4Y', 'EOG09263G4R', 'EOG0926248W', 'EOG09260T28', 'EOG092624KK', 'EOG09263OD3', 'EOG09261Q18', 'EOG092658WY', 'EOG09265GSM', 'EOG09265B95', 'EOG092604I8', 'EOG09264FXB', 'EOG09264ZQC', 'EOG09264PI4', 'EOG09262VPD', 'EOG09262QS5', 'EOG09261ZPW', 'EOG09263ZBF', 'EOG09262YAU', 'EOG09262SMG', 'EOG092608ZS', 'EOG0926229Z', 'EOG09261YRA', 'EOG09263EQZ', 'EOG09260TWS', 'EOG09265OQH', 'EOG09263720', 'EOG092653NM', 'EOG09260AZK', 'EOG09261AH9', 'EOG09265B1X', 'EOG09263817', 'EOG0926112A', 'EOG092601KZ', 'EOG09264X31', 'EOG09264398', 'EOG09261N2L', 'EOG09262LI4', 'EOG0926074Y', 'EOG09260FPA', 'EOG09264MGU', 'EOG092626EQ', 'EOG09264U81', 'EOG09265FCK', 'EOG09260BFE', 'EOG09264CA0', 'EOG092603EH', 'EOG092653VU', 'EOG09262NB1', 'EOG092619MJ', 'EOG09260CKC', 'EOG09261DHR', 'EOG09262TO9', 'EOG092625U6', 'EOG09263MGE', 'EOG09264PDD', 'EOG09263IMF', 'EOG092648K5', 'EOG092602MO', 'EOG09263C55', 'EOG09260EZT', 'EOG09264NC7', 'EOG09262JAT', 'EOG09260E8K', 'EOG0926133I', 'EOG092612CC', 'EOG092600SK', 'EOG092648LP', 'EOG09260VTN', 'EOG092648VW', 'EOG09264O6J', 'EOG0926514P', 'EOG09263W7L', 'EOG09262LYR', 'EOG09265PQX', 'EOG09263QH4', 'EOG09260DXP', 'EOG09260WU6', 'EOG09263NE7', 'EOG09265G9U', 'EOG0926388H', 'EOG0926425H', 'EOG09264HTG', 'EOG09260EAZ', 'EOG0926357F', 'EOG09262JWJ', 'EOG092608RH', 
'EOG092629WA', 'EOG092657UN', 'EOG09265PUI', 'EOG0926419M', 'EOG09264JHE', 'EOG09263OQH', 'EOG092638CT', 'EOG09262CBI', 'EOG09262X01', 'EOG092640BS', 'EOG09264DY4', 'EOG09264Y0W', 'EOG092619VG', 'EOG092651FJ', 'EOG09261LPY', 'EOG09261OXD', 'EOG09262ESR', 'EOG0926251E', 'EOG0926310O', 'EOG09264T8I', 'EOG092602FH', 'EOG092607OQ', 'EOG09265NHW', 'EOG09264331', 'EOG09261666', 'EOG09260LRX', 'EOG09260A27', 'EOG09262N10', 'EOG09261B18', 'EOG09260SAH', 'EOG09260ERO', 'EOG09261Y04', 'EOG09261EU7', 'EOG09263EVJ', 'EOG09263MEM', 'EOG09260274', 'EOG09264OYZ', 'EOG09264DT4', 'EOG09263OZR', 'EOG09261W90', 'EOG0926347W', 'EOG09264NEF', 'EOG09264LC7', 'EOG09263FR7', 'EOG09260AQB', 'EOG0926306O', 'EOG09260QVP', 'EOG09261JUE', 'EOG09261I1G', 'EOG09264XOZ', 'EOG09260SIZ', 'EOG09264LBC', 'EOG09262V8E', 'EOG09262GXD', 'EOG09263C4C', 'EOG09260RRC', 'EOG092640WA', 'EOG09263A5D', 'EOG09265313', 'EOG092632WW', 'EOG09263U08', 'EOG09265SHM', 'EOG09260SL3', 'EOG092619GP', 'EOG09263690', 'EOG09263ULA', 'EOG09264RIE', 'EOG09262CMP', 'EOG0926073O', 'EOG09264NJ1', 'EOG09263OAE', 'EOG09263BE5', 'EOG09260RS7', 'EOG09260NY2', 'EOG09261O7R', 'EOG092653YS', 'EOG092657YR', 'EOG09260WUA', 'EOG09262JZW', 'EOG09263LNF', 'EOG09264THP', 'EOG09260Z3X', 'EOG0926115P', 'EOG09261WVT', 'EOG09262E4Q', 'EOG09265I60', 'EOG09262DUV', 'EOG09261C0G', 'EOG09261XNJ', 'EOG092658X5', 'EOG092658CI', 'EOG09263A3Y', 'EOG09263IQ5', 'EOG092654LJ', 'EOG09260KGS', 'EOG09262MXH', 'EOG092611HB', 'EOG09263J6Z', 'EOG09260BRA', 'EOG09264903', 'EOG09262GVX', 'EOG09263R4M', 'EOG09264IIZ', 'EOG09262NNS', 'EOG092606AD', 'EOG09263ZW6', 'EOG09263JTO', 'EOG092651K1', 'EOG09263Q8J', 'EOG09261X9E', 'EOG09262PAY', 'EOG09262CUO', 'EOG09261B3Q', 'EOG09263L9T', 'EOG09260W9L', 'EOG09263X1F', 'EOG09263YFX', 'EOG09260DUR', 'EOG09261DW8', 'EOG092654VM', 'EOG09260NJW', 'EOG09260JTZ', 'EOG09263YFH', 'EOG09260JED', 'EOG092613QA', 'EOG09263KB4', 'EOG09262GLP', 'EOG09265GGX', 'EOG092625OH', 'EOG09265KSE', 'EOG09262FE3', 'EOG09264I14', 'EOG09264L0C', 
'EOG09263E5F', 'EOG0926448Q', 'EOG09264KIV', 'EOG092645G0', 'EOG09261JCQ', 'EOG09265FI4', 'EOG09265KPR', 'EOG09260GIX', 'EOG09264904', 'EOG09260P2K', 'EOG09262WXK', 'EOG09264COX', 'EOG09260SRF', 'EOG09265IBC', 'EOG09264I9J', 'EOG092656JA', 'EOG0926213Z', 'EOG092635YY', 'EOG09264AWW', 'EOG09264V2U', 'EOG092645L9', 'EOG092624SJ', 'EOG09260075', 'EOG09260AZA', 'EOG092654XA', 'EOG092620CR', 'EOG09263X0V', 'EOG092655M5', 'EOG092648Q0', 'EOG09260R84', 'EOG092626HU', 'EOG09263XVS', 'EOG092600NM', 'EOG092659OC', 'EOG09263BG5', 'EOG09264T0J', 'EOG09263GSR', 'EOG092652YI', 'EOG092654QZ', 'EOG09264ZXA', 'EOG09263SFX', 'EOG09262XMN', 'EOG09262645', 'EOG09264CP4', 'EOG092600S9', 'EOG09264W46', 'EOG09262XGK', 'EOG09263E9V', 'EOG09262UTQ', 'EOG09264NNY', 'EOG09265KNL', 'EOG09265PJ3', 'EOG09260HS3', 'EOG092605KN', 'EOG092634B5', 'EOG09263HD8', 'EOG0926142Y', 'EOG09261YLQ', 'EOG09262WHU', 'EOG09265E6R', 'EOG09261660', 'EOG092619RJ', 'EOG09264XF3', 'EOG09263E87', 'EOG092649XV', 'EOG09264WF4', 'EOG09261HQU', 'EOG09261MPU', 'EOG09260V8Q', 'EOG09265G5C', 'EOG0926049A', 'EOG092643Y5', 'EOG0926079Q', 'EOG09262DPL', 'EOG092621GA', 'EOG09262XRU', 'EOG09263PXH', 'EOG092624X0', 'EOG092652TN', 'EOG09260OE9', 'EOG09261NW2', 'EOG092653KS', 'EOG09260KNI', 'EOG09265BE5', 'EOG09264G7L', 'EOG09261F73', 'EOG09264LJU', 'EOG092639H5', 'EOG09264RW6', 'EOG092620E4', 'EOG09263GIG', 'EOG09260OLB', 'EOG09263JW5', 'EOG092620U5', 'EOG09262QTY', 'EOG092606CY', 'EOG09264OBA', 'EOG092653SU', 'EOG092643VB', 'EOG09260N2T', 'EOG092608AE', 'EOG0926499W', 'EOG0926049S', 'EOG09262D4G', 'EOG09264YEG', 'EOG09265JVH', 'EOG09265BTC', 'EOG092644O2', 'EOG09263CUQ', 'EOG0926004Z', 'EOG09261127', 'EOG09262QJW', 'EOG09263RF8', 'EOG09264P4J', 'EOG09265DRM', 'EOG09260JT9', 'EOG09260A98', 'EOG09265DWT', 'EOG092615SM', 'EOG09264873', 'EOG09263CGP', 'EOG09263L52', 'EOG0926195C', 'EOG09260OQ8', 'EOG092602OP', 'EOG09262A65', 'EOG09261OIA', 'EOG09260GR8', 'EOG092646EZ', 'EOG09260W52', 'EOG092600T4', 'EOG09260B3H', 'EOG09264XM2', 
'EOG092644ZU', 'EOG09264R0D', 'EOG09261B14', 'EOG09260J53', 'EOG092647L7', 'EOG092632FC', 'EOG09261476', 'EOG09261S7X', 'EOG09262BCZ', 'EOG09264YIJ', 'EOG0926386D', 'EOG092620IA', 'EOG0926384F', 'EOG092640IZ', 'EOG0926436T', 'EOG09264V30', 'EOG09262D0D', 'EOG092624RX', 'EOG09264G4X', 'EOG09263SZM', 'EOG09260RRN', 'EOG09263AZP', 'EOG09261FAB', 'EOG092646VF', 'EOG09263HED', 'EOG09261W2O', 'EOG09264CND', 'EOG0926390Q', 'EOG09261K9J', 'EOG09264BIX', 'EOG092617AN', 'EOG09260JJW', 'EOG09262ZZ8', 'EOG09264IQ7', 'EOG092644Z6', 'EOG09261JR0', 'EOG092605FC', 'EOG09263CLY', 'EOG092643NE', 'EOG092652Y6', 'EOG0926213Q', 'EOG092610ZY', 'EOG09264IDN', 'EOG092643JW', 'EOG09260JNY', 'EOG0926477X', 'EOG09265GYD', 'EOG092605QM', 'EOG09262KXK', 'EOG09263J3H', 'EOG09264HOY', 'EOG092617RN', 'EOG09261CXQ', 'EOG09262E3Q', 'EOG092614UB', 'EOG092652NQ', 'EOG092627F1', 'EOG09263S2P', 'EOG092612MY', 'EOG09262SWJ', 'EOG09264W71', 'EOG09264PD5', 'EOG09261Q5L', 'EOG09260KWP', 'EOG09260RGH', 'EOG092634G9', 'EOG09263RDD', 'EOG09264B74', 'EOG09261H5E', 'EOG09262YQG', 'EOG09260S5R', 'EOG09261UMG', 'EOG09264TQ5', 'EOG092603JK', 'EOG09264F1U', 'EOG09260FL2', 'EOG09261S0S', 'EOG09263DFA', 'EOG0926077L', 'EOG09260LI6', 'EOG09263IT0', 'EOG09260W8D', 'EOG09264IG5', 'EOG09261EMF', 'EOG09262A8G', 'EOG09260Y2Q', 'EOG09265D7J', 'EOG09260Z5E', 'EOG09261031', 'EOG092621YU', 'EOG092630MJ', 'EOG092634M1', 'EOG09263FAK', 'EOG09261N64', 'EOG09260NE0', 'EOG09262X74', 'EOG09260BSW', 'EOG092609YT', 'EOG09263X4B', 'EOG09264CP8', 'EOG092638XA', 'EOG09264OXC', 'EOG092658WS', 'EOG09260S3L', 'EOG09262H50', 'EOG092608AV', 'EOG09263RY2', 'EOG092631QQ', 'EOG09260TDY', 'EOG09262L1P', 'EOG092656RK', 'EOG09263Z41', 'EOG092636Y6', 'EOG09264DMU', 'EOG09261NLR', 'EOG09262X8R', 'EOG09264SUZ', 'EOG092657H8', 'EOG09264OM7', 'EOG0926591L', 'EOG09260FFP', 'EOG09263H5H', 'EOG09264B3O', 'EOG09264JW6', 'EOG09261YV6', 'EOG09262E7W', 'EOG09261A3K', 'EOG092646PE', 'EOG09260B3X', 'EOG09260JPV', 'EOG0926312D', 'EOG09260C2V', 'EOG09264VRO', 
'EOG092628LW', 'EOG09264O2D', 'EOG09260LJ8', 'EOG092609JB', 'EOG09262HA6', 'EOG09264DOU', 'EOG09260U6R', 'EOG09261PJZ', 'EOG09260EPQ', 'EOG09261UPM', 'EOG092649QJ', 'EOG09261DJC', 'EOG09260JO5', 'EOG09263GUT', 'EOG09264GMT', 'EOG09264ENO', 'EOG09263WB5', 'EOG09260K4V', 'EOG09261VI3', 'EOG09260K24', 'EOG09260XOG', 'EOG09265BTA', 'EOG092628HC', 'EOG09264DYD', 'EOG09262H1X', 'EOG09264DJ8', 'EOG09264XKX', 'EOG092600W1', 'EOG09262VOC', 'EOG09261LV7', 'EOG092652KR', 'EOG09261VUC', 'EOG09262KZ3', 'EOG092631UM', 'EOG09262IZO', 'EOG09262UB3', 'EOG09262ILV', 'EOG09261MOX', 'EOG09263X22', 'EOG09263FKC', 'EOG09260WGT', 'EOG09264X41', 'EOG09263KEE', 'EOG09264JQ1', 'EOG09260EQD', 'EOG09261V9P', 'EOG092656IY', 'EOG09265CQO', 'EOG09264XTW', 'EOG09260H81', 'EOG09263ZSC', 'EOG092655L0', 'EOG092648XW', 'EOG09264ZDJ', 'EOG09264LH2', 'EOG09263FTE', 'EOG09265A4E', 'EOG092604ML', 'EOG092615IE', 'EOG09262914', 'EOG09263HYJ', 'EOG09262Q8D', 'EOG09263JFQ', 'EOG09264C3V', 'EOG092612LP', 'EOG09260NC6', 'EOG09264E6Z', 'EOG09264V51', 'EOG092643QM', 'EOG09262Z0M', 'EOG09265AT5', 'EOG09264MZ1', 'EOG092606AJ', 'EOG09264829', 'EOG09264SQJ', 'EOG0926131E', 'EOG09260MRU', 'EOG09263274', 'EOG0926423H', 'EOG09262B1U', 'EOG09260VCG', 'EOG09264D7Y', 'EOG09264L6D', 'EOG09264BJC', 'EOG09262MFL', 'EOG09263OWL', 'EOG092645U1', 'EOG09260NWN', 'EOG0926025H', 'EOG09262DC1', 'EOG09262K8C', 'EOG092611G7', 'EOG09260LBU', 'EOG09262JZK', 'EOG092648U2', 'EOG09264OBO', 'EOG09263EBB', 'EOG09262IZ6', 'EOG09264G1F', 'EOG09260XQV', 'EOG09262QRH', 'EOG09264A8D', 'EOG09261B6Y', 'EOG09265DDU', 'EOG09265GXF', 'EOG09260SJV', 'EOG09265E8A', 'EOG09265H9T', 'EOG09261M78', 'EOG09265JA7', 'EOG0926137U', 'EOG09261FM4', 'EOG09261QXC', 'EOG09264UJF', 'EOG09264XT5', 'EOG09261MMR', 'EOG09262PMC', 'EOG09262HQM', 'EOG09263JZO', 'EOG09265ANI', 'EOG09261QR8', 'EOG09263XZN', 'EOG09260EOI', 'EOG09262N0U', 'EOG09262PIH', 'EOG092652ZZ', 'EOG09262A6N', 'EOG092642I5', 'EOG09260VTA', 'EOG09260E2O', 'EOG092600T9', 'EOG09265K60', 'EOG09263LR1', 
'EOG092614E6', 'EOG09262M2J', 'EOG0926140Q', 'EOG092621CK', 'EOG092619EA', 'EOG09264LKR', 'EOG092621CP', 'EOG09262M7B', 'EOG09260779', 'EOG09262N47', 'EOG09265RGI', 'EOG09260XL0', 'EOG09261ZJR', 'EOG09261IEH', 'EOG09265C25', 'EOG09264RBX', 'EOG09263EDC', 'EOG092629U5', 'EOG09265822', 'EOG09265BBJ', 'EOG09261404', 'EOG09263RW3', 'EOG09260375', 'EOG09261LEU', 'EOG09262341', 'EOG092631IR', 'EOG092618M2', 'EOG09264U78', 'EOG09263CG4', 'EOG092645LS', 'EOG09262VVF', 'EOG09261XAF', 'EOG09261N20', 'EOG09260UXC', 'EOG09265HPJ', 'EOG09261V2P', 'EOG09262516', 'EOG09260LS9', 'EOG09260KCB', 'EOG09260V5Q', 'EOG09260FKU', 'EOG09264OML', 'EOG09262V9N', 'EOG09265CN0', 'EOG09260HPO', 'EOG092644DY', 'EOG09260XH5', 'EOG09262F7P', 'EOG09263A64', 'EOG0926505R', 'EOG09264RJ7', 'EOG09261QR5', 'EOG09263ZBJ', 'EOG09261PUF', 'EOG09262SR7', 'EOG09260RNZ', 'EOG09260WYQ', 'EOG09260BYL', 'EOG092644X6', 'EOG09263G3M', 'EOG0926510L', 'EOG092606WZ', 'EOG092638EN', 'EOG0926489S', 'EOG09264JK1', 'EOG092608T8', 'EOG09264RR2', 'EOG09264O9F', 'EOG092600X0', 'EOG09264EGS', 'EOG092634ZL', 'EOG09264ABT', 'EOG09264FYY', 'EOG09260M87', 'EOG09264B2P', 'EOG092630YS', 'EOG09264HN5', 'EOG092644TW', 'EOG09262G8Y', 'EOG09263MNN', 'EOG09263BGW', 'EOG09261ONU', 'EOG09264L06', 'EOG09261KHB', 'EOG092603D0', 'EOG09262J7K', 'EOG09261M4S', 'EOG09262WSH', 'EOG09263Z8I', 'EOG09265EOF', 'EOG092604MJ', 'EOG09261V03', 'EOG09260HSP', 'EOG092608WU', 'EOG09260NHB', 'EOG092638RC', 'EOG0926158Y', 'EOG09260GF5', 'EOG09263OCO', 'EOG09260AEC', 'EOG092610LQ', 'EOG092603SM', 'EOG09260DFH', 'EOG092624UF', 'EOG09265I7S', 'EOG092614DJ', 'EOG09263CG2', 'EOG09263I7I', 'EOG09260N53', 'EOG09264BWL', 'EOG09260J8F', 'EOG09264T1U', 'EOG09265KF9', 'EOG09264KTU', 'EOG092613R2', 'EOG09264TJN', 'EOG0926354S', 'EOG0926420U', 'EOG09261ZXZ', 'EOG092654O3', 'EOG09265FGA', 'EOG09260MBW', 'EOG09264CST', 'EOG092619NK', 'EOG09263WZ2', 'EOG09264UD3', 'EOG092650I8', 'EOG09261FH7', 'EOG09261225', 'EOG09264KK7', 'EOG09264YKY', 'EOG09262P1W', 'EOG09262C5Z', 
'EOG09262KUJ', 'EOG09264L3W', 'EOG09264G04', 'EOG09260WUS', 'EOG09264XPY', 'EOG09264FVQ', 'EOG09260DUW', 'EOG092653LT', 'EOG09265LEG', 'EOG092656CM', 'EOG09264Z1B', 'EOG09263TQ5', 'EOG092658ZO', 'EOG09260OM6', 'EOG092635SS', 'EOG09261KRX', 'EOG092647CM', 'EOG09265BG5', 'EOG09264IKZ', 'EOG09261UOJ', 'EOG09263UWJ', 'EOG09260HMA', 'EOG09262Q6S', 'EOG09260APA', 'EOG09264MN3', 'EOG09265591', 'EOG09265ER6', 'EOG09262I0R', 'EOG09260931', 'EOG092633QB', 'EOG09261RFF', 'EOG092603KJ', 'EOG09262BHE', 'EOG09262IP2', 'EOG09264DIM', 'EOG09262E98', 'EOG092649VA', 'EOG09264YHS', 'EOG09260PJ9', 'EOG092628SP', 'EOG09264K2W', 'EOG09264IV9', 'EOG09261W1K', 'EOG09260JDM', 'EOG09260H6E', 'EOG092613UB', 'EOG09264P74', 'EOG09261V87', 'EOG092624JL', 'EOG09262O0R', 'EOG09262PZ9', 'EOG09264GQZ', 'EOG09261HB6', 'EOG09264IOS', 'EOG09262MOO', 'EOG09261CM0', 'EOG09265K4K', 'EOG09265AL8', 'EOG09261EY9', 'EOG092651HW', 'EOG09260B65', 'EOG092629WJ', 'EOG092628FW', 'EOG09260OZU', 'EOG09261OKK', 'EOG092604S1', 'EOG092631MU', 'EOG09264USX', 'EOG09260TPT', 'EOG09261G4Z', 'EOG09261RWU', 'EOG09262LQT', 'EOG092605VU'] were not found in the ancestral_variants file
INFO Running tblastn, writing output to /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/blast_output/tblastn_awesome_busco_missing_and_frag_rerun.tsv...
INFO [tblastn] Warning: [tblastn] Query is Empty!
INFO Getting coordinates for candidate regions...
INFO ****** Step 2/3, current time: 06/29/2022 21:25:01 ******
INFO Training Augustus using Single-Copy Complete BUSCOs:
INFO 06/29/2022 21:25:01 => Converting predicted genes to short genbank files...
INFO 06/29/2022 21:25:07 => All files converted to short genbank files, now running the training scripts...
INFO Pre-Augustus scaffold extraction...
INFO Re-running Augustus with the new metaparameters, number of target BUSCOs: 1135
INFO 06/29/2022 21:25:09 => 0% of predictions performed (0 to be done)
INFO 06/29/2022 21:25:09 => 100% of predictions performed
INFO Extracting predicted proteins...
INFO ****** Step 3/3, current time: 06/29/2022 21:25:09 ******
INFO Running HMMER to confirm orthology of predicted proteins:
INFO 06/29/2022 21:25:09 => 0% of predictions performed (0 to be done)
INFO 06/29/2022 21:25:09 => 100% of predictions performed
INFO Results:
INFO C:13.5%[S:13.3%,D:0.2%],F:0.1%,M:86.4%,n:1312
INFO 177 Complete BUSCOs (C)
INFO 175 Complete and single-copy BUSCOs (S)
INFO 2 Complete and duplicated BUSCOs (D)
INFO 1 Fragmented BUSCOs (F)
INFO 1134 Missing BUSCOs (M)
INFO 1312 Total BUSCO groups searched

INFO BUSCO analysis done with WARNING(s). Total running time: 300.09778451919556 seconds
INFO Results written in /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/

INFO ****************** Start a BUSCO 2.0 analysis, current time: 06/29/2022 21:25:30 ******************
INFO The lineage dataset is: dikarya_odb9 (eukaryota)
INFO Mode is: proteins
INFO To reproduce this run: python /hpc/home/idm7/miniconda3/envs/annotate/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-BUSCO2.py -i /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco_augustus.proteins.fasta -o awesome_busco -l /hpc/group/bio1/ian/envs/funannotate_db/dikarya/ -m proteins -c 10 -sp anidulans
INFO Check dependencies...
INFO Check input file...
INFO Temp directory is ./tmp/
INFO Running HMMER on the proteins:
INFO 06/29/2022 21:25:30 => 0% of predictions performed (1312 to be done)
INFO 06/29/2022 21:25:32 => 10% of predictions performed (134/1312 candidate proteins)
INFO 06/29/2022 21:25:33 => 20% of predictions performed (263/1312 candidate proteins)
INFO 06/29/2022 21:25:35 => 30% of predictions performed (396/1312 candidate proteins)
INFO 06/29/2022 21:25:38 => 40% of predictions performed (525/1312 candidate proteins)
INFO 06/29/2022 21:25:41 => 50% of predictions performed (659/1312 candidate proteins)
INFO 06/29/2022 21:25:44 => 60% of predictions performed (791/1312 candidate proteins)
INFO 06/29/2022 21:25:47 => 70% of predictions performed (922/1312 candidate proteins)
INFO 06/29/2022 21:25:52 => 80% of predictions performed (1054/1312 candidate proteins)
INFO 06/29/2022 21:25:56 => 90% of predictions performed (1181/1312 candidate proteins)
INFO 06/29/2022 21:26:00 => 100% of predictions performed
INFO Results:
INFO C:13.3%[S:13.3%,D:0.0%],F:0.0%,M:86.7%,n:1312
INFO 175 Complete BUSCOs (C)
INFO 175 Complete and single-copy BUSCOs (S)
INFO 0 Complete and duplicated BUSCOs (D)
INFO 0 Fragmented BUSCOs (F)
INFO 1137 Missing BUSCOs (M)
INFO 1312 Total BUSCO groups searched

INFO BUSCO analysis done. Total running time: 33.35695195198059 seconds
INFO Results written in /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco_proteins/run_awesome_busco/

`
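The repeated hmmsearch message "Line 2: illegal character %" in busco.log suggests that the extracted protein files contain characters that are not valid sequence letters. A minimal sketch for locating them is below. It is not part of BUSCO or funannotate; the function names and the `ALLOWED` alphabet are assumptions, and the alphabet hmmsearch itself accepts may differ.

```python
import glob
import os

# Standard amino-acid letters plus common ambiguity codes and the stop symbol.
# This alphabet is an assumption made for illustration.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYBJOUXZ*")

def find_illegal_characters(fasta_path):
    """Return (line_number, character) pairs for unexpected sequence characters."""
    problems = []
    with open(fasta_path) as handle:
        for number, line in enumerate(handle, start=1):
            if line.startswith(">"):
                continue  # header lines may contain any text
            for char in line.strip():
                if char.upper() not in ALLOWED:
                    problems.append((number, char))
    return problems

def scan_directory(directory, pattern="*.faa*"):
    """Map each matching file that has problems to its list of illegal characters."""
    report = {}
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        problems = find_illegal_characters(path)
        if problems:
            report[path] = problems
    return report
```

Pointing `scan_directory` at the `augustus_output/extracted_proteins` directory named in the log would list every affected file along with the offending line numbers, which makes it easier to inspect what Augustus actually wrote on line 2.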

OS/Install Information

Installed using mamba on a computing cluster running Red Hat Enterprise Linux 8.

`-------------------------------------------------------
Checking dependencies for 1.8.11

You are running Python v 3.8.12. Now checking python packages...
biopython: 1.77
goatools: 1.2.3
matplotlib: 3.4.3
natsort: 8.1.0
numpy: 1.23.0
pandas: 1.4.3
psutil: 5.9.1
requests: 2.28.1
scikit-learn: 1.1.1
scipy: 1.8.1
seaborn: 0.11.2
All 11 python packages installed

You are running Perl v b'5.026002'. Now checking perl modules...
Carp: 1.38
Clone: 0.42
DBD::SQLite: 1.64
DBD::mysql: 4.046
DBI: 1.642
DB_File: 1.855
Data::Dumper: 2.173
File::Basename: 2.85
File::Which: 1.23
Getopt::Long: 2.5
Hash::Merge: 0.300
JSON: 4.02
LWP::UserAgent: 6.39
Logger::Simple: 2.0
POSIX: 1.76
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.12
Tie::File: 1.02
URI::Escape: 3.31
YAML: 1.29
local::lib: 2.000024
threads: 2.15
threads::shared: 1.56
All 27 Perl modules installed

Checking Environmental Variables...
$FUNANNOTATE_DB=/hpc/group/bio1/ian/envs/funannotate_db
$PASAHOME=/hpc/home/idm7/miniconda3/envs/annotate/opt/pasa-2.5.2
$TRINITY_HOME=/hpc/home/idm7/miniconda3/envs/annotate/opt/trinity-2.8.5
$EVM_HOME=/hpc/home/idm7/miniconda3/envs/annotate/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/hpc/home/idm7/miniconda3/envs/annotate/config/
$GENEMARK_PATH=/hpc/group/bio1/ian/envs/funannotate/gmes_petap
All 6 environmental variables are set

Checking external dependencies...
PASA: 2.5.2
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.3.3
bamtools: bamtools 2.5.1
bedtools: bedtools v2.30.0
blat: BLAT v35
diamond: 2.0.15
ete3: 3.1.2
exonerate: exonerate 2.4.0
fasta: no way to determine
glimmerhmm: 3.0.4
gmap: 2021-08-25
hisat2: 2.2.1
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 11.0.1-internal
kallisto: 0.46.1
mafft: v7.505 (2022/Apr/10)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.24-r1122
pigz: pigz 2.7
proteinortho: 6.1.0
pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.12
snap: 2006-07-28
stringtie: 2.2.1
tRNAscan-SE: 2.0.9 (July 2021)
tantan: tantan 39
tbl2asn: no way to determine, likely 25.X
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
ERROR: emapper.py not installed
ERROR: gmes_petap.pl not installed
ERROR: signalp not installed`
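To summarize which external tools the dependency check reports as missing, the output above can be filtered for its `ERROR:` lines. This is a small sketch, assuming the line format `ERROR: <tool> not installed` exactly as it appears in this report; `missing_tools` is a hypothetical helper name.

```python
import re

def missing_tools(report_text):
    """Return tool names from lines of the form 'ERROR: <tool> not installed'."""
    return re.findall(
        r"^ERROR: (\S+) not installed", report_text, flags=re.MULTILINE
    )

# Last lines of the dependency check output pasted above.
sample = """tbl2asn: no way to determine, likely 25.X
ERROR: emapper.py not installed
ERROR: gmes_petap.pl not installed
ERROR: signalp not installed"""

print(missing_tools(sample))  # ['emapper.py', 'gmes_petap.pl', 'signalp']
```

Note that `gmes_petap.pl` is reported as not installed even though `$GENEMARK_PATH` is set, which is consistent with the "GeneMark appears to be functional? False" line in funannotate-predict.log.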
