Because ribosomes step three nucleotides in each cycle of translation elongation, in many ribosome profiling datasets a triplet periodicity is observable in the distribution of ribosome-protected footprints.
In a good dataset, 70-90% of the reads on a codon fall within the first of the three codon positions. This allows deduction of translation reading frames, if the reading frame is not known a priori. See [IGNW09] for more details.
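As a rough illustration of what "phasing" means here, the sketch below tallies, for each read length, the fraction of reads whose mapped position falls in each of the three codon positions. It is a simplified stand-in rather than this script's implementation: it assumes a single forward-strand coding region with known start and end coordinates, and the names `phase_by_length`, `reads`, `cds_start`, `cds_end` and `offset` are hypothetical.

```python
from collections import defaultdict


def phase_by_length(reads, cds_start, cds_end, offset=0):
    """Return {read_length: [frac_pos0, frac_pos1, frac_pos2]}.

    reads     -- iterable of (five_prime_position, read_length) tuples
    cds_start -- coordinate of the first nucleotide of the start codon
    cds_end   -- end coordinate of the coding region (exclusive)
    offset    -- nucleotides added to the 5' position before assigning a frame
    (All names are illustrative assumptions, not part of the script's API.)
    """
    counts = defaultdict(lambda: [0, 0, 0])
    for position, length in reads:
        site = position + offset
        # Only count reads that map inside the coding region.
        if cds_start <= site < cds_end:
            counts[length][(site - cds_start) % 3] += 1

    fractions = {}
    for length, per_frame in counts.items():
        total = sum(per_frame)
        fractions[length] = [n / total for n in per_frame]
    return fractions


example_reads = [(100, 28), (103, 28), (106, 28), (101, 28), (102, 29), (105, 29)]
for length, fracs in sorted(phase_by_length(example_reads, 100, 130).items()):
    print(length, fracs)
```

In this toy input, three of the four 28-nt reads land on the first codon position, which is the kind of enrichment a well-phased read length would show.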
- Read phasing for each read length
- Plot of phasing by read length
where OUTBASE is supplied by the user.
Optional. ROI file of maximal spanning windows surrounding start codons, from the
metagene generate subprogram. Using this instead of --annotation_files prevents double-counting of codons when multiple transcript isoforms exist for a gene. See the documentation for metagene for more info about ROI files. If an ROI file is not given, supply an annotation with
Required. Basename for output files
show this help message and exit
Codons before and after start codon to ignore (Default: 5)
Suppress all warning messages. Cannot use with ‘-v’.
Increase verbosity. With ‘-v’, show every warning. With ‘-vv’, turn warnings into exceptions. Cannot use with ‘-q’. (Default: show each type of warning once)
Annotation file options (one or more annotation files required)
Open one or more genome annotation files
--annotation_files infile.[BED | BigBed | GTF2 | GFF3] [infile.[BED | BigBed | GTF2 | GFF3] ...]
Zero or more annotation files (max 1 file if BigBed)
Format of annotation_files (Default: GTF2). Note: GFF3 assembly assumes SO v.2.5.2 feature ontologies, which may or may not match your specific file.
If supplied, coding regions will be extended by 3 nucleotides at their 3’ ends (except for GTF2 files that explicitly include stop_codon features). Use if your annotation file excludes stop codons from CDS.
annotation_files are tabix-compressed and indexed (Default: False). Ignored for BigBed files.
annotation_files are sorted by chromosomal position (Default: False)
--bed_extra_columns BED_EXTRA_COLUMNS [BED_EXTRA_COLUMNS ...]
Number of extra columns in BED file (e.g. in custom ENCODE formats) or list of names for those columns. (Default: 0).
Maximum desired memory footprint in MB to devote to BigBed/BigWig files. May be exceeded by large queries. (Default: 0, No maximum)
--gff_transcript_types GFF_TRANSCRIPT_TYPES [GFF_TRANSCRIPT_TYPES ...]
GFF3 feature types to include as transcripts, even if no exons are present (for GFF3 only; default: use SO v2.5.3 specification)
--gff_exon_types GFF_EXON_TYPES [GFF_EXON_TYPES ...]
GFF3 feature types to include as exons (for GFF3 only; default: use SO v2.5.3 specification)
--gff_cds_types GFF_CDS_TYPES [GFF_CDS_TYPES ...]
GFF3 feature types to include as CDS (for GFF3 only; default: use SO v2.5.3 specification)
Count & alignment file options
Open alignment or count files and optionally set mapping rules
--count_files COUNT_FILES [COUNT_FILES ...]
One or more count or alignment file(s) from a single sample or set of samples to be pooled.
Format of file containing alignments or counts (Default: BAM)
Sum used in normalization of counts and RPKM/RPNT calculations (Default: total mapped reads/counts in dataset)
Minimum read length required to be included (BAM & bowtie files only. Default: 25)
Maximum read length permitted to be included (BAM & bowtie files only. Default: 100)
Alignment mapping functions (BAM & bowtie files only)
For BAM or bowtie files, one of the mutually exclusive read mapping functions is required:
Map read alignment to a variable offset from 5’ position of read, with offset determined by read length. Requires --offset below.
Map read alignment to 5’ position.
Map read alignment to 3’ position
Subtract N positions from each end of read, and add 1/(length-N) to each remaining position, where N is specified by --nibble
Filtering and alignment mapping options
The remaining arguments are optional and affect the behavior of specific mapping functions:
For --fiveprime or --threeprime, provide an integer representing the offset into the read, starting from either the 5’ or 3’ end, at which data should be plotted. For --fiveprime_variable, provide the filename of a two-column tab-delimited text file, in which the first column represents read length or the special keyword ‘default’, and the second column represents the offset from the 5’ end of that read length at which the read should be mapped. (Default: 0)
For use with --center only. Number of nucleotides to remove from each end of read before mapping (Default: 0)
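The two-column offset file described above for variable 5’ mapping can be read with a few lines of Python. This is a minimal sketch of one way to parse such a file, not the parser this script uses; the helper names `read_offset_table` and `offset_for` and the example offsets are made up for illustration.

```python
import io


def read_offset_table(handle):
    """Parse a tab-delimited table of read length (or 'default') and offset.

    Hypothetical helper: returns a dict whose keys are integer read lengths
    plus, if present in the file, the string 'default'.
    """
    offsets = {}
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, value = line.split("\t")
        offsets[key if key == "default" else int(key)] = int(value)
    return offsets


def offset_for(offsets, read_length):
    """Look up the offset for a read length, falling back to 'default'."""
    if read_length in offsets:
        return offsets[read_length]
    return offsets["default"]  # raises KeyError if no default row was given


# Example table; the numbers are illustrative, not recommended values.
example = io.StringIO("28\t12\n29\t13\ndefault\t14\n")
table = read_offset_table(example)
print(offset_for(table, 28), offset_for(table, 31))
```

Read lengths that have their own row use that row's offset; any other length falls back to the ‘default’ row.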
File format for figure(s) (Default: png)
--figsize N N
Figure width and height, in inches. (Default: use matplotlibrc params)
Base title for plot(s).
Matplotlib color map from which palette will be made (e.g. ‘Blues’, ‘autumn’, ‘Set1’; default: use colors from --stylesheet if given, or color cycle in matplotlibrc)
Figure resolution (Default: 150)
Use this matplotlib stylesheet instead of matplotlibrc params
main(argv=sys.argv[1:])
- argv : list, optional
A list of command-line arguments, which will be processed as if the script were called from the command line if main() is called directly.
Default: sys.argv[1:]. The command-line arguments, if the script is invoked from the command line
Helper function to extract coding portions from maximal spanning windows flanking CDS starts that are created by
Coding portion of maximal spanning window