Package org.snpeff.snpEffect.factory
Class SnpEffPredictorFactory
- java.lang.Object
-
- org.snpeff.snpEffect.factory.SnpEffPredictorFactory
-
- Direct Known Subclasses:
SnpEffPredictorFactoryFeatures, SnpEffPredictorFactoryGenesFile, SnpEffPredictorFactoryGff, SnpEffPredictorFactoryKnownGene, SnpEffPredictorFactoryRefSeq
public abstract class SnpEffPredictorFactory extends java.lang.Object
This class creates a SnpEffectPredictor from a file (or a set of files) and a configuration.
- Author:
- pcingola
-
-
Field Summary
static int MARK
static int MIN_TOTAL_FRAME_COUNT
-
Constructor Summary
SnpEffPredictorFactory(Config config, int inOffset)
-
Method Summary
protected void add(Cds cds)
protected void add(Chromosome chromo)
protected Exon add(Exon exon)
    Add an exon
protected void add(Gene gene)
    Add a Gene
protected void add(Marker marker)
    Add a generic Marker
protected void add(Transcript tr)
    Add a transcript
protected void addMarker(Marker marker, boolean unique)
    Add a marker to the collection
protected void addSequences(java.lang.String chr, java.lang.String chrSeq)
    Add genomic reference sequences
protected void adjustChromosomes()
    Adjust chromosome length using gene information. This is used when the sequence is not available (which only makes sense for test cases and debugging).
protected void adjustTranscripts()
    Adjust transcripts: recalculate start, end, strand, etc.
protected void beforeExonSequences()
    Perform some actions before reading sequences
protected void codingFromCds()
    Only coding transcripts have CDS: make sure that transcripts having CDS are protein coding. It might not always be "precise", though:
    $ grep CDS genes.gtf | cut -f 2 | ~/snpEff/scripts/uniqCount.pl
    113 IG_C_gene, 64 IG_D_gene, 24 IG_J_gene, 366 IG_V_gene, 21 TR_C_gene, 3 TR_D_gene, 82 TR_J_gene, 296 TR_V_gene, 461 non_stop_decay, 63322 nonsense_mediated_decay, 905 polymorphic_pseudogene, 34 processed_transcript, 1340112 protein_coding
protected void collapseZeroLenIntrons()
    Collapse exons having zero size introns between them
abstract SnpEffectPredictor create()
protected void createRandSequences()
    Create random sequences for exons. Note: This is only used for test cases!
protected void deleteRedundant()
    Consolidate transcripts: If two exons are one right next to the other, join them.
protected void exonsFromCds()
    Create exons from CDS info
protected void exonsFromCds(Transcript tr)
    Create exons from CDS info. WARNING: We might end up with redundant exons if some exons existed before this process.
protected Gene findGene(java.lang.String id)
protected Gene findGene(java.lang.String geneId, java.lang.String id)
protected Marker findMarker(java.lang.String id)
protected Transcript findTranscript(java.lang.String id)
protected Transcript findTranscript(java.lang.String trId, java.lang.String id)
protected Chromosome getOrCreateChromosome(java.lang.String chromoName)
    Get a chromosome.
java.util.Map<java.lang.String,java.lang.String> getProteinByTrId()
protected int parsePosition(java.lang.String posStr)
    Parse a string as a 'position'.
protected void readExonSequences()
    Read exon sequences from a FASTA file
protected void replaceTranscript(Transcript trOld, Transcript trNew)
void setCircularCorrectLargeGap(boolean circularCorrectLargeGap)
void setCreateRandSequences(boolean createRandSequences)
void setDebug(boolean debug)
void setFastaFile(java.lang.String fastaFile)
void setFileName(java.lang.String fileName)
void setRandom(java.util.Random random)
void setReadSequences(boolean readSequences)
    Read sequences? Note: This is only used for debugging and testing
void setStoreSequences(boolean storeSequences)
void setVerbose(boolean verbose)
protected java.lang.String showChromoNamesDifferences()
    Show differences in chromosome names
-
-
-
Field Detail
-
MARK
public static final int MARK
- See Also:
- Constant Field Values
-
MIN_TOTAL_FRAME_COUNT
public static int MIN_TOTAL_FRAME_COUNT
-
-
Constructor Detail
-
SnpEffPredictorFactory
public SnpEffPredictorFactory(Config config, int inOffset)
-
-
Method Detail
-
add
protected void add(Cds cds)
-
add
protected void add(Chromosome chromo)
-
add
protected Exon add(Exon exon)
Add an exon
- Parameters:
exon -
- Returns:
- exon added. Note: If the exon exists with the same ID, return old exon. If exon exists with same ID and same coordinates, add a new exon with different ID.
-
add
protected void add(Gene gene)
Add a Gene
-
add
protected void add(Marker marker)
Add a generic Marker
-
add
protected void add(Transcript tr)
Add a transcript
-
addMarker
protected void addMarker(Marker marker, boolean unique)
Add a marker to the collection
-
addSequences
protected void addSequences(java.lang.String chr, java.lang.String chrSeq)
Add genomic reference sequences
-
adjustChromosomes
protected void adjustChromosomes()
Adjust chromosome length using gene information. This is used when the sequence is not available (which only makes sense for test cases and debugging).
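One plausible reading of "using gene information" is that each chromosome is made at least as long as the largest end coordinate of any gene on it. The sketch below illustrates only that idea. `ChromosomeLengthSketch` and `GeneSpan` are hypothetical stand-ins, not SnpEff classes, and the real method may apply additional rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in types; not part of SnpEff.
class ChromosomeLengthSketch {
    static final class GeneSpan {
        final String chromoName;
        final int end; // end coordinate of the gene

        GeneSpan(String chromoName, int end) {
            this.chromoName = chromoName;
            this.end = end;
        }
    }

    // Assumption: a chromosome must reach at least the largest gene end on it.
    static Map<String, Integer> maxGeneEndByChromosome(List<GeneSpan> genes) {
        Map<String, Integer> maxEnd = new TreeMap<>();
        for (GeneSpan gene : genes) {
            maxEnd.merge(gene.chromoName, gene.end, Integer::max);
        }
        return maxEnd;
    }

    public static void main(String[] args) {
        List<GeneSpan> genes = new ArrayList<>();
        genes.add(new GeneSpan("chr1", 500));
        genes.add(new GeneSpan("chr1", 1200));
        genes.add(new GeneSpan("chr2", 300));
        System.out.println(maxGeneEndByChromosome(genes)); // prints {chr1=1200, chr2=300}
    }
}
```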
-
adjustTranscripts
protected void adjustTranscripts()
Adjust transcripts: recalculate start, end, strand, etc.
-
beforeExonSequences
protected void beforeExonSequences()
Perform some actions before reading sequences
-
codingFromCds
protected void codingFromCds()
Only coding transcripts have CDS: make sure that transcripts having CDS are protein coding.
It might not always be "precise", though:
$ grep CDS genes.gtf | cut -f 2 | ~/snpEff/scripts/uniqCount.pl
113 IG_C_gene
64 IG_D_gene
24 IG_J_gene
366 IG_V_gene
21 TR_C_gene
3 TR_D_gene
82 TR_J_gene
296 TR_V_gene
461 non_stop_decay
63322 nonsense_mediated_decay
905 polymorphic_pseudogene
34 processed_transcript
1340112 protein_coding
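For readers without the helper script, the shell pipeline above can be approximated in Java. This is a rough sketch, assuming that `uniqCount.pl` simply counts how often each distinct value occurs. The class name `CdsSourceCountSketch` and the input lines in `main` are made up for illustration and are not real GTF records.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; approximates `grep CDS | cut -f 2 | uniqCount.pl`.
class CdsSourceCountSketch {
    static Map<String, Integer> countCdsSources(List<String> gtfLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : gtfLines) {
            if (!line.contains("CDS")) continue; // like `grep CDS`
            String[] fields = line.split("\t");
            if (fields.length < 2) continue; // simplification: skip lines without a second column
            counts.merge(fields[1], 1, Integer::sum); // like `cut -f 2` followed by counting
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up, abbreviated lines for illustration only
        List<String> lines = Arrays.asList(
                "chr1\tprotein_coding\tCDS\t100\t200",
                "chr1\tprotein_coding\texon\t100\t250",
                "chr1\tnonsense_mediated_decay\tCDS\t300\t400",
                "chr1\tprotein_coding\tCDS\t500\t600");
        System.out.println(countCdsSources(lines)); // prints {nonsense_mediated_decay=1, protein_coding=2}
    }
}
```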
-
collapseZeroLenIntrons
protected void collapseZeroLenIntrons()
Collapse exons having zero size introns between them
-
create
public abstract SnpEffectPredictor create()
-
createRandSequences
protected void createRandSequences()
Create random sequences for exons. Note: This is only used for test cases!
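Because the class also exposes setRandom(java.util.Random), a test can presumably supply a seeded generator to get reproducible sequences. The sketch below shows that idea in isolation. `RandomSequenceSketch`, its lower-case `acgt` alphabet, and the seed value are assumptions for illustration, not the SnpEff implementation.

```java
import java.util.Random;

// Hypothetical helper; not part of SnpEff.
class RandomSequenceSketch {
    private static final char[] BASES = { 'a', 'c', 'g', 't' };

    // Build a random nucleotide string of the requested length.
    static String randomSequence(Random random, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(BASES[random.nextInt(BASES.length)]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The same seed yields the same sequence on every run
        String first = randomSequence(new Random(42L), 20);
        String second = randomSequence(new Random(42L), 20);
        System.out.println(first);
        System.out.println(first.equals(second)); // prints true
    }
}
```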
-
deleteRedundant
protected void deleteRedundant()
Consolidate transcripts: If two exons are one right next to the other, join them.
E.g. exon1:1234-2345, exon2:2346-2400 => exon:1234-2400
This happens mostly in GTF files, where the stop codon is specified separately from the exon info.
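The join rule in the example above can be sketched as follows. This is a minimal illustration, not the SnpEff implementation: exons are represented as hypothetical `int[]{start, end}` pairs with inclusive coordinates, and only exons that are exactly adjacent (next start equals previous end plus one) are merged. Overlapping exons are left alone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; exons are int[]{start, end} with inclusive coordinates.
class JoinAdjacentExonsSketch {
    static List<int[]> joinAdjacent(List<int[]> exons) {
        // Sort a copy by start coordinate so neighbours end up next to each other
        List<int[]> sorted = new ArrayList<>(exons);
        sorted.sort((a, b) -> Integer.compare(a[0], b[0]));

        List<int[]> joined = new ArrayList<>();
        for (int[] exon : sorted) {
            int[] last = joined.isEmpty() ? null : joined.get(joined.size() - 1);
            if (last != null && exon[0] == last[1] + 1) {
                last[1] = exon[1]; // exactly adjacent: extend the previous exon
            } else {
                joined.add(new int[] { exon[0], exon[1] }); // copy, so the input is not modified
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<int[]> exons = new ArrayList<>();
        exons.add(new int[] { 1234, 2345 });
        exons.add(new int[] { 2346, 2400 });
        for (int[] exon : joinAdjacent(exons)) {
            System.out.println(exon[0] + "-" + exon[1]); // prints 1234-2400
        }
    }
}
```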
-
exonsFromCds
protected void exonsFromCds()
Create exons from CDS info
-
exonsFromCds
protected void exonsFromCds(Transcript tr)
Create exons from CDS info. WARNING: We might end up with redundant exons if some exons existed before this process.
- Parameters:
tr - Transcript with CDS info, but no exons
-
findGene
protected Gene findGene(java.lang.String id)
-
findGene
protected Gene findGene(java.lang.String geneId, java.lang.String id)
-
findMarker
protected Marker findMarker(java.lang.String id)
-
findTranscript
protected Transcript findTranscript(java.lang.String id)
-
findTranscript
protected Transcript findTranscript(java.lang.String trId, java.lang.String id)
-
getOrCreateChromosome
protected Chromosome getOrCreateChromosome(java.lang.String chromoName)
Get a chromosome. If it doesn't exist, create it
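The get-or-create behaviour described here is commonly written with Map.computeIfAbsent. The sketch below shows that pattern with hypothetical stand-in types (`ChromosomeRegistrySketch`, `Chromo`). How the real class stores its chromosomes is not stated on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in types; not part of SnpEff.
class ChromosomeRegistrySketch {
    static final class Chromo {
        final String name;

        Chromo(String name) {
            this.name = name;
        }
    }

    private final Map<String, Chromo> chromosomesByName = new HashMap<>();

    // Return the existing chromosome for this name, or create and remember a new one.
    Chromo getOrCreateChromosome(String chromoName) {
        return chromosomesByName.computeIfAbsent(chromoName, Chromo::new);
    }

    int size() {
        return chromosomesByName.size();
    }

    public static void main(String[] args) {
        ChromosomeRegistrySketch registry = new ChromosomeRegistrySketch();
        Chromo first = registry.getOrCreateChromosome("chr1");
        Chromo second = registry.getOrCreateChromosome("chr1");
        System.out.println(first == second); // prints true
        System.out.println(registry.size()); // prints 1
    }
}
```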
-
getProteinByTrId
public java.util.Map<java.lang.String,java.lang.String> getProteinByTrId()
-
parsePosition
protected int parsePosition(java.lang.String posStr)
Parse a string as a 'position'. Note: It subtracts 'inOffset' so that all coordinates are zero-based
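The offset arithmetic can be illustrated with a small stand-alone sketch. `ParsePositionSketch` is a hypothetical class, and plain Integer.parseInt is an assumption: the real method may parse or handle malformed input differently. With inOffset = 1, a one-based position in the input file becomes a zero-based coordinate.

```java
// Hypothetical stand-in; shows only the offset arithmetic.
class ParsePositionSketch {
    private final int inOffset; // e.g. 1 for one-based input, 0 for zero-based input

    ParsePositionSketch(int inOffset) {
        this.inOffset = inOffset;
    }

    // Assumption: plain Integer.parseInt, which throws NumberFormatException on bad input.
    int parsePosition(String posStr) {
        return Integer.parseInt(posStr) - inOffset;
    }

    public static void main(String[] args) {
        ParsePositionSketch oneBased = new ParsePositionSketch(1);
        System.out.println(oneBased.parsePosition("1234")); // prints 1233
    }
}
```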
-
readExonSequences
protected void readExonSequences()
Read exon sequences from a FASTA file
-
replaceTranscript
protected void replaceTranscript(Transcript trOld, Transcript trNew)
-
setCircularCorrectLargeGap
public void setCircularCorrectLargeGap(boolean circularCorrectLargeGap)
-
setCreateRandSequences
public void setCreateRandSequences(boolean createRandSequences)
-
setDebug
public void setDebug(boolean debug)
-
setFastaFile
public void setFastaFile(java.lang.String fastaFile)
-
setFileName
public void setFileName(java.lang.String fileName)
-
setRandom
public void setRandom(java.util.Random random)
-
setReadSequences
public void setReadSequences(boolean readSequences)
Read sequences? Note: This is only used for debugging and testing
-
setStoreSequences
public void setStoreSequences(boolean storeSequences)
-
setVerbose
public void setVerbose(boolean verbose)
-
showChromoNamesDifferences
protected java.lang.String showChromoNamesDifferences()
Show differences in chromosome names
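The page does not say which two sets of names are compared. Assuming they are the chromosome names seen in the annotation file and those seen in the sequence file, a report of this kind could be built from two set differences, as sketched below. `ChromoNameDiffSketch` and the output wording are hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; not part of SnpEff.
class ChromoNameDiffSketch {
    // Names present in 'first' but missing from 'second', in sorted order.
    static Set<String> onlyIn(Set<String> first, Set<String> second) {
        Set<String> diff = new TreeSet<>(first);
        diff.removeAll(second);
        return diff;
    }

    static String showDifferences(Set<String> annotationNames, Set<String> sequenceNames) {
        return "Only in annotation: " + onlyIn(annotationNames, sequenceNames)
                + "\nOnly in sequences: " + onlyIn(sequenceNames, annotationNames);
    }

    public static void main(String[] args) {
        Set<String> annotation = new TreeSet<>(Arrays.asList("1", "2", "MT"));
        Set<String> sequences = new TreeSet<>(Arrays.asList("chr1", "chr2", "MT"));
        System.out.println(showDifferences(annotation, sequences));
        // prints:
        // Only in annotation: [1, 2]
        // Only in sequences: [chr1, chr2]
    }
}
```

A mismatch like the one in `main` (`1` versus `chr1`) is a typical reason for sequences not being matched to any annotated chromosome.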
-
-