FAQ 3: I get errors related to the GFF file. What should I do?

 

I am trying to align contigs to CDS and calculate the genome-wide Ka/Ks values. I am 
facing some errors and I was wondering if you can help with them.

The error is as follows. I see many lines that say:

Use of "goto" to jump into a construct is deprecated at GKas.pl line 
682, <GFF> line ....
The program generates the blat_rel.query and new_blatout files and then 
exits.
I exported the mouse CDS (coding sequences) using Ensembl BioMart, and I 
used the mouse GTF file from Ensembl. I don't know if this will cause a 
problem.
The command I used is as follows:

perl GKas.pl 
-query_seq="/home/Packages/GKas/mm_all_cds_mart_export2.cds" 
-gff="/home/Packages/GKas/Mus_musculus.GRCm38.68.gtf" 
-hit_seq="/home/Packages/GKas/scaffolds.fasta" -spe=2 
-codeml="/home/Packages/paml4.6/bin/codeml" 
-detail="/home/Packages/GKas/eske_detail" 
-kaks_file="/home/Packages/GKas/kaks" 
-problem_loc="/home/Packages/GKas/problem_loc"


 

I'm glad you are trying it. In my experience, the most common cause of this error is ambiguity in the GFF/GTF file format. I suggest you preprocess the GTF file and make sure it looks like the attached example (example.gff3).

The key points for the GTF/GFF file are:

  1. make sure the third column includes exon, CDS, gene and mRNA;
  2. make sure the attributes column (the ninth and last column) contains only the IDs
    • make sure the ID for the CDS, exon and mRNA records of a transcript is the same one;
    • you had better keep a mapping list that records how the CDS, exon and mRNA IDs relate to the gene ID. This is optional: it is convenient for you, but GKas does not need it.
  3. when the strand is "-", make sure the CDS records are sorted from largest to smallest coordinate (the rice GFF3 file is in this format, but some others may not be; the pipeline was first developed according to the rice data). Otherwise the result will be wrong, because the sequences are combined in the wrong order.
  4. make sure the sequences in the CDS FASTA file have the same IDs as the CDS IDs in the GFF file.
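
The checks above can be sketched as a small script to run before GKas. This is only a rough sketch, not part of GKas: the names check_gff, feature_id and read_fasta_ids are made up for illustration. It assumes a tab-separated file with nine columns, with the ID in the last column either bare or written as ID=..., and it compares minus-strand CDS records by their start coordinate.

```python
import re
from collections import defaultdict

REQUIRED_TYPES = {"gene", "mRNA", "exon", "CDS"}


def feature_id(attributes):
    """Return the ID from the last column: 'ID=x;...' or a bare ID (assumed layouts)."""
    match = re.search(r"(?:^|;)\s*ID=([^;]+)", attributes)
    return match.group(1) if match else attributes.strip()


def read_fasta_ids(fasta_lines):
    """Collect the first word of every FASTA header line."""
    return {line[1:].split()[0] for line in fasta_lines
            if line.startswith(">") and len(line.strip()) > 1}


def check_gff(gff_lines, fasta_ids=None):
    """Return a list of human-readable problems found in the GFF lines."""
    problems = []
    ids_by_type = defaultdict(set)   # feature type -> IDs seen for that type
    cds_parts = defaultdict(list)    # CDS ID -> [(strand, start)] in file order

    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue                 # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            problems.append(f"fewer than 9 columns: {line.strip()!r}")
            continue
        ftype, start, strand = cols[2], int(cols[3]), cols[6]
        fid = feature_id(cols[8])
        ids_by_type[ftype].add(fid)
        if ftype == "CDS":
            cds_parts[fid].append((strand, start))

    # Point 1: the third column must contain all four feature types.
    for missing in sorted(REQUIRED_TYPES - set(ids_by_type)):
        problems.append(f"no '{missing}' features in column 3")

    for cds_id, parts in cds_parts.items():
        # Point 2: CDS, exon and mRNA records should share one ID.
        for other in ("exon", "mRNA"):
            if cds_id not in ids_by_type[other]:
                problems.append(f"CDS ID has no matching {other} ID: {cds_id}")
        # Point 3: minus-strand CDS must run from largest to smallest.
        starts = [s for strand, s in parts if strand == "-"]
        if starts != sorted(starts, reverse=True):
            problems.append(f"minus-strand CDS not sorted largest to smallest: {cds_id}")

    # Point 4: every CDS ID must also be a FASTA sequence ID.
    if fasta_ids is not None:
        for cds_id in sorted(set(cds_parts) - set(fasta_ids)):
            problems.append(f"CDS ID missing from FASTA: {cds_id}")
    return problems
```

One possible use is to open the GFF file and the CDS FASTA file, call check_gff(gff_handle, read_fasta_ids(fasta_handle)), and fix every reported line before running GKas. An empty list means none of these four problems was found; it does not guarantee that the file is otherwise valid.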

After this, you can set -spe=6, and it should then work.