plastid.bin.reformat_transcripts module

Convert transcripts from BED, BigBed, GTF2, GFF3, or PSL format to BED, extended BED, or GTF2 format.

Note

GFF3 schemas vary

GFF3 files differ in how they organize their feature hierarchies. By default, we assume the ontology used by the Sequence Ontology consortium. Users who require a different schema may supply transcript_types and exon_types to indicate which sorts of features should be included.

Identity relationships between elements vary between GFF3 files

GFF3 files can represent discontiguous features using two strategies. In one strategy, the exons of a transcript have unique IDs but share the same parent ID in the Parent attribute in column 9 of the GFF. In the other strategy, different exons of the same transcript simply share the same ID and don't define a Parent. Here, both schemes are accepted, although the behavior is undefined if they conflict within a single transcript.
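The two strategies can be illustrated with a minimal sketch. This is not plastid's parser: parse_attributes and group_exons are hypothetical helpers written for this example, and the GFF3 lines are made-up data. The sketch groups exon rows by their Parent attribute when present and falls back to a shared ID otherwise.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Parse a GFF3 column-9 string such as 'ID=x;Parent=y' into a dict."""
    attrs = {}
    for item in column9.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def group_exons(gff_lines, exon_types=("exon",)):
    """Group exon coordinates by transcript (hypothetical helper).

    Uses the Parent attribute if an exon defines one (strategy 1);
    otherwise uses the exon's own ID, which is shared by all exons
    of the same transcript (strategy 2).
    """
    groups = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] not in exon_types:
            continue
        attrs = parse_attributes(cols[8])
        key = attrs.get("Parent") or attrs.get("ID")
        if key is None:
            continue
        # GFF3 allows several comma-separated parents per feature
        for transcript_id in key.split(","):
            groups[transcript_id].append((int(cols[3]), int(cols[4])))
    return {k: sorted(v) for k, v in groups.items()}


# Made-up example rows: the same transcript written both ways
parent_style = [
    "chrI\t.\texon\t100\t200\t.\t+\t.\tID=exon1;Parent=tx1",
    "chrI\t.\texon\t300\t400\t.\t+\t.\tID=exon2;Parent=tx1",
]
shared_id_style = [
    "chrI\t.\texon\t100\t200\t.\t+\t.\tID=tx1",
    "chrI\t.\texon\t300\t400\t.\t+\t.\tID=tx1",
]

by_parent = group_exons(parent_style)
by_shared_id = group_exons(shared_id_style)
```

Both inputs yield the same grouping under this sketch. A file that mixes the two schemes within one transcript would be grouped by whichever attribute each row happens to carry, which is one way the undefined behavior mentioned above can arise.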

plastid.bin.reformat_transcripts.fix_name(inp, names_used)

Append a number if an autoSql field name is duplicated.
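As a rough illustration of this kind of de-duplication, the sketch below appends the smallest unused integer to a repeated name. It is an assumption about the general technique, not the actual body of fix_name: dedupe_name is a hypothetical stand-in, and whether the real function updates names_used itself is not stated here, so the caller records each name explicitly.

```python
def dedupe_name(name, names_used):
    """Return name unchanged if unused, else name plus the first free number.

    Hypothetical stand-in for illustration; does not modify names_used.
    """
    if name not in names_used:
        return name
    i = 1
    while f"{name}{i}" in names_used:
        i += 1
    return f"{name}{i}"


names_used = set()
fields = []
for raw in ["gene_id", "score", "gene_id", "gene_id"]:
    unique = dedupe_name(raw, names_used)
    names_used.add(unique)  # the caller tracks which names are taken
    fields.append(unique)
```

With the made-up input above, the repeated "gene_id" entries become "gene_id1" and "gene_id2".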

plastid.bin.reformat_transcripts.main(argv=sys.argv[1:])

Command-line program

Parameters
argv : list, optional

A list of command-line arguments, which will be processed as if the script were called from the command line if main() is called directly.

Default: sys.argv[1:] (the actual command-line arguments)
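The argv parameter follows a common pattern for making a command-line script callable from Python. The sketch below shows that pattern with a toy parser; the --format option and outfile argument are hypothetical and are not the options of reformat_transcripts. It also uses a None sentinel rather than a default evaluated at import time, which is a design choice of this example.

```python
import argparse
import sys


def main(argv=None):
    """Toy entry point: parse argv as if it came from the command line."""
    if argv is None:
        # Read the real command line at call time
        argv = sys.argv[1:]
    parser = argparse.ArgumentParser(description="toy example parser")
    parser.add_argument("outfile", help="hypothetical output file name")
    parser.add_argument(
        "--format",
        default="BED",
        choices=["BED", "GTF2"],
        help="hypothetical output format option",
    )
    return parser.parse_args(argv)


# Calling main() directly with an explicit argument list
args = main(["--format", "GTF2", "out.gtf"])
```

Passing a list this way lets tests or other scripts drive the program without touching sys.argv.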