plastid.readers.gff module
Tools for reading, writing, analyzing, and manipulating GFF file subtypes (e.g. GTF2 and GFF3).
Summary
Because GTF2/GFF3 files are hierarchically structured – i.e. a complex feature can be assembled from several component features, each component feature having its own record on its own line – two interfaces for reading GTF2/GFF3 files are included:
- Assembly of transcripts from exon, CDS, & UTR annotations
GTF2_TranscriptAssembler and GFF3_TranscriptAssembler collect individual exon and CDS features, and assemble these into Transcripts. Features are read from GTF2/GFF3 files and grouped by transcript_id, Parent, or ID attributes, depending on file type. Assembled Transcripts are yielded only when their component features have been fully collected.
- Low-level parsing of simple features
GTF2_Reader and GFF3_Reader read raw features (such as individual exons, stop codons, SNPs, etc.) from GTF2/GFF3 files. Each line is returned as a SegmentChain.
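Module contents
GTF2_Reader | Read raw features in GTF2 files as SegmentChains
GTF2_TranscriptAssembler | Assemble Transcripts from raw features in GTF2 format
GFF3_Reader | Read raw features in GFF3 files as SegmentChains
GFF3_TranscriptAssembler | Assemble Transcripts from raw features in GFF3 format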
Examples
GTF2_Reader and GFF3_Reader return raw, unmodified features from GTF2 or GFF3 files – e.g. exons, coding regions, stop codons – without assembling them into transcripts:
>>> from plastid.readers.gff import GTF2_Reader
>>> feature_reader = GTF2_Reader("some_file.gtf")
>>> for feature in feature_reader:
...     print(feature.get_name(), feature.attr["type"], str(feature))
('YAL030W_mRNA', 'exon', 'chrI:87262-87387(+)')
('YAL030W_mRNA', 'exon', 'chrI:87500-87857(+)')
('YAL030W_mRNA', 'CDS', 'chrI:87285-87387(+)')
('YAL030W_mRNA', 'CDS', 'chrI:87500-87749(+)')
('YAL030W_mRNA', 'start_codon', 'chrI:87285-87288(+)')
('YAL030W_mRNA', 'stop_codon', 'chrI:87749-87752(+)')
('YBL092W_mRNA', 'exon', 'chrII:45643-45644(+)')
('YBL092W_mRNA', 'exon', 'chrII:45977-46440(+)')
('YBL092W_mRNA', 'CDS', 'chrII:45977-46367(+)')
('YBL092W_mRNA', 'start_codon', 'chrII:45977-45980(+)')
[rest of output omitted]
In contrast, GTF2_TranscriptAssembler and GFF3_TranscriptAssembler reconstruct transcripts from their components, based upon their transcript_id, ID, or Parent attributes. Note how all features are of type mRNA, and how some contain multiple exons (coordinates separated by ‘^’):
>>> from plastid.readers.gff import GTF2_TranscriptAssembler
>>> transcript_reader = GTF2_TranscriptAssembler("some_file.gtf")
>>> for transcript in transcript_reader:
...     print(transcript.get_name(), transcript.attr["type"], str(transcript))
('YAL030W_mRNA', 'mRNA', 'chrI:87262-87387^87500-87857(+)')
('YBL092W_mRNA', 'mRNA', 'chrII:45643-45644^45977-46440(+)')
('YBL057C_mRNA', 'mRNA', 'chrII:112749-113427^113444-113450(-)')
('YBL040C_mRNA', 'mRNA', 'chrII:142033-142749^142846-142891(-)')
('YBL018C_mRNA', 'mRNA', 'chrII:185961-186352^186427-186504(-)')
('YBR012W-B', 'mRNA', 'chrII:259868-261173^261174-265140(+)')
('YBR044C_mRNA', 'mRNA', 'chrII:324292-324336^324340-326127(-)')
('YBR082C_mRNA', 'mRNA', 'chrII:406506-407027^407122-407379(-)')
('YBR126W-B_mRNA', 'mRNA', 'chrII:490824-491202(+)')
('YBR138C_mRNA', 'mRNA', 'chrII:513636-515391(-)')
[rest of output omitted]
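To make the difference between the two outputs concrete, here is a minimal pure-Python sketch of the grouping idea. It is not plastid's implementation: the function assemble_exons and the tuple layout are hypothetical, and the coordinates are copied from the raw exon records shown above. It groups exon records by transcript ID and joins their spans with ‘^’.

```python
from collections import defaultdict

# Hypothetical record layout: (transcript_id, type, chrom, start, end, strand).
# Coordinates are taken from the raw-feature output shown in the examples.
raw_features = [
    ("YAL030W_mRNA", "exon", "chrI", 87262, 87387, "+"),
    ("YAL030W_mRNA", "exon", "chrI", 87500, 87857, "+"),
    ("YAL030W_mRNA", "CDS", "chrI", 87285, 87387, "+"),
    ("YAL030W_mRNA", "CDS", "chrI", 87500, 87749, "+"),
    ("YBL092W_mRNA", "exon", "chrII", 45643, 45644, "+"),
    ("YBL092W_mRNA", "exon", "chrII", 45977, 46440, "+"),
]


def assemble_exons(features):
    """Group exon records by transcript ID and format each group as one chain."""
    grouped = defaultdict(list)
    for tid, ftype, chrom, start, end, strand in features:
        if ftype == "exon":  # only exons define the transcript's positions here
            grouped[tid].append((chrom, start, end, strand))

    assembled = {}
    for tid, exons in grouped.items():
        exons.sort(key=lambda exon: exon[1])  # order segments by start coordinate
        chrom, strand = exons[0][0], exons[0][3]
        spans = "^".join("%d-%d" % (start, end) for _, start, end, _ in exons)
        assembled[tid] = "%s:%s(%s)" % (chrom, spans, strand)
    return assembled


chains = assemble_exons(raw_features)
```

Each value in chains is a single multi-segment string of the same form as the assembler output above, while the input held one record per segment.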
See Also
- GFF3 specification
GFF3 specification by the Sequence Ontology consortium
- GTF2.2 specification
Hosted by the Brent lab
- UCSC file format FAQ.
GFF & GTF descriptions at UCSC
- class plastid.readers.gff.GFF3_Reader(*streams, end_included=True, return_stopfeatures=False, is_sorted=False, tabix=False)
Bases: plastid.readers.gff.AbstractGFF_Reader
Read raw features in GFF3 files as SegmentChains.
Users who wish to reconstruct Transcripts from raw features should instead use GFF3_TranscriptAssembler, which performs this task automatically.
Assumes the input stream uses 1-indexed coordinates, in compliance with the Sequence Ontology GFF3 specification.
GFF3 attributes (from column 9) for each record are stored in its attr dictionary. Names and values of attributes are unescaped. The values for the attributes Parent, Alias, Dbxref, dbxref, and Note, if present, are lists rather than strings, because the GFF3 spec allows these to have multiple values.
- Parameters
- *streams : one or more str or file-like
One or more input streams or filenames pointing to GFF information
- end_included : bool, optional
Whether the end coordinate is included in the feature (closed or ‘end-included’ intervals) or not (half-open intervals). All coordinates will be normalized to 0-indexed, half-open. (Default: True)
- return_stopfeatures : bool, optional
If True, return a special SegmentChain called StopFeature signifying that all previously emitted GFF entries may be assembled into complete entities. These are emitted when the line “###” is encountered in a GFF3. (Default: False)
- is_sorted : bool, optional
If True and return_stopfeatures is True, assume the GFF3 is sorted. The reader will return StopFeature when the chromosome name of a given feature differs from that of the previous feature. (Default: False)
- tabix : boolean, optional
streams point to tabix-compressed files or are open tabix_file_iterator (Default: False)
Examples
Read raw features from a GFF3 file:
>>> from plastid.readers.gff import GFF3_Reader
>>> feature_reader = GFF3_Reader(open("./some_file.gff"))
>>> for feature in feature_reader:
...     print(feature.get_name(), feature.attr["type"], str(feature))
('chrI', 'chromosome', 'chrI:0-230218(.)')
('TEL01L-TR', 'telomeric_repeat', 'chrI:0-62(-)')
('TEL01L', 'telomere', 'chrI:0-801(-)')
('TEL01L-XR', 'X_element_combinatorial_repeat', 'chrI:62-336(-)')
('YAL069W', 'gene', 'chrI:334-649(+)')
('TEL01L-XC', 'X_element', 'chrI:336-801(-)')
('TEL01L-XC_nucleotide_match', 'nucleotide_match', 'chrI:752-763(-)')
('TEL01L-XC_binding_site', 'binding_site', 'chrI:531-544(-)')
('YAL068W-A', 'gene', 'chrI:537-792(+)')
('ARS102', 'ARS', 'chrI:649-1791(.)')
[rest of output omitted]
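The attribute handling described for this class (unescaped names and values, with lists for Parent, Alias, Dbxref, dbxref, and Note) can be sketched in plain Python. This is an illustration under assumptions, not plastid's parser: parse_gff3_attributes and the sample column-9 string are hypothetical, and percent-decoding is done with the standard library's urllib.parse.unquote.

```python
from urllib.parse import unquote

# Attributes that the class description lists as multi-valued
MULTI_VALUED = {"Parent", "Alias", "Dbxref", "dbxref", "Note"}


def parse_gff3_attributes(column9):
    """Parse a GFF3 column-9 string into a dict, unescaping names and values."""
    attr = {}
    for item in column9.strip().split(";"):
        if not item:
            continue  # tolerate a trailing semicolon
        key, _, value = item.partition("=")
        key = unquote(key.strip())
        if key in MULTI_VALUED:
            # Split on literal commas first, then unescape each value
            attr[key] = [unquote(part) for part in value.split(",")]
        else:
            attr[key] = unquote(value)
    return attr


attr = parse_gff3_attributes(
    "ID=exon1;Parent=mRNA1,mRNA2;Note=first%2C second%3B last"
)
```

Splitting before unescaping matters: an escaped comma (%2C) or semicolon (%3B) inside a value must not be mistaken for a separator.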
- Attributes
- metadata : dict
Dictionary of metadata found in file headers
Methods
close() | Close stream
fileno() | Returns underlying file descriptor if one exists.
filter(line) | Parses lines of the GFF stream into SegmentChain. When metadata is found, temporarily delegates processing to _parse_metatokens(), and then reads the next genomic feature.
flush(/) | Flush write buffers, if applicable.
isatty() | Return whether this is an 'interactive' stream.
read() | Similar to file.read().
readable() | Return whether object was opened for reading.
readline() | Process a single line of data, assuming it is string-like. next(self) is more likely to behave as expected.
readlines() | Similar to file.readlines().
seek() | Change stream position.
seekable() | Return whether object supports random access.
tell(/) | Return current stream position.
truncate() | Truncate file to size bytes.
writable() | Return whether object was opened for writing.
writelines(lines, /) | Write a list of lines to stream.
next
- close()
Close stream
- fileno()
Returns underlying file descriptor if one exists.
OSError is raised if the IO object does not use a file descriptor.
- filter(line)
Parses lines of the GFF stream into SegmentChain. When metadata is found, temporarily delegates processing to _parse_metatokens(), and then reads the next genomic feature.
- Parameters
- line
Next line from GFF stream
- Returns
SegmentChain
Next feature in file
- flush(/)
Flush write buffers, if applicable.
This is not implemented for read-only and non-blocking streams.
- isatty()
Return whether this is an ‘interactive’ stream.
Return False if it can’t be determined.
- next()
- read()
Similar to file.read(). Process all units of data, assuming it is string-like.
- Returns
- str
- readable()
Return whether object was opened for reading.
If False, read() will raise OSError.
- readline()
Process a single line of data, assuming it is string-like. next(self) is more likely to behave as expected.
- Returns
- object
a unit of processed data
- readlines()
Similar to file.readlines().
- Returns
- list
processed data
- seek()
Change stream position.
Change the stream position to the given byte offset. The offset is interpreted relative to the position indicated by whence. Values for whence are:
0 – start of stream (the default); offset should be zero or positive
1 – current stream position; offset may be negative
2 – end of stream; offset is usually negative
Return the new absolute position.
- seekable()
Return whether object supports random access.
If False, seek(), tell() and truncate() will raise OSError. This method may need to do a test seek().
- tell(/)
Return current stream position.
- truncate()
Truncate file to size bytes.
File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.
- writable()
Return whether object was opened for writing.
If False, write() will raise OSError.
- writelines(lines, /)
Write a list of lines to stream.
Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.
- closed
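The whence values listed under seek() are those of Python's standard io streams. The sketch below shows them on an in-memory io.BytesIO object, used here only as a stand-in stream; whether a particular plastid reader is seekable depends on the underlying stream it wraps.

```python
import io

# In-memory byte stream standing in for an open file
stream = io.BytesIO(b"0123456789")

pos_start = stream.seek(3)                 # whence=0 (default): offset from start
pos_current = stream.seek(2, io.SEEK_CUR)  # whence=1: offset from current position
pos_end = stream.seek(-4, io.SEEK_END)     # whence=2: offset from end of stream
remaining = stream.read()                  # bytes from the new position onward
```

Each call returns the new absolute position, so the three calls return 3, 5, and 6 for this ten-byte stream.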
- class plastid.readers.gff.GFF3_TranscriptAssembler(*streams, is_sorted=False, return_type=SegmentChain, add_three_for_stop=False, printer=None, tabix=False)
Bases: plastid.readers.gff.AbstractGFF_Assembler
Assemble Transcripts from raw features in GFF3 format.
Within a chromosome, transcripts are returned in lexical order. Features that do not constitute portions of transcripts (e.g. origins of replication) are ignored. For access to those, read raw features using GFF3_Reader.
- Parameters
- *streams : one or more str or file-like
One or more input streams or filenames pointing to GFF3 data
- is_sorted : bool, optional
GFF3 is sorted by chromosome name, allowing some memory savings (Default: False)
- return_type : SegmentChain or subclass, optional
Type of feature to return from assembled subfeatures (Default: SegmentChain)
- add_three_for_stop : bool, optional
Some annotation files exclude the stop codon from CDS annotations. If set to True, three nucleotides will be added to the 3' end of each CDS annotation. (Default: False)
- transcript_types : list, optional
List of GFF3 feature types that should be considered as transcripts (Default: as specified in SO 2.5.3)
- exon_types : list, optional
List of GFF3 feature types that should be considered as exons or contributing to transcript nucleotide positions during transcript assembly (Default: as specified in SO 2.5.3)
- cds_types : list, optional
List of GFF3 feature types that should be considered as CDS or contributing to transcript coding regions during transcript assembly (Default: as specified in SO 2.5.3)
- printer : file-like, optional
Logger implementing a write() method. (Default: NullWriter)
- tabix : boolean, optional
streams point to tabix-compressed files or are open tabix_file_iterator (Default: False)
Notes
- GFF3 schemas vary
GFF3 files can have many different schemas of hierarchy. We deal with that here by allowing users to supply transcript_types and exon_types, to indicate which sorts of features should be included. By default, we use a subset of the schema set out in Sequence Ontology 2.5.3.
Briefly:
1. The GFF3 file is combed for transcripts of the types specified by transcript_types, exons specified by exon_types, and CDS specified by types listed in cds_types.
2. Exons and CDS are matched with their parent transcripts by matching the Parent attributes of CDS and exons to the ID of transcripts. Transcripts are then constructed from those intervals, and coding regions set accordingly.
3. If exons and/or CDS features point to a Parent that is not in transcript_types, they are grouped into a new transcript, whose ID is set to the value of their shared Parent. However, this value for Parent might refer to a gene rather than a transcript; unfortunately this cannot be known without other information. Attributes that are common to all CDS and exon features are bubbled up to the transcript.
4. If exons and/or CDS features have no Parent, but share a common ID, they are grouped by ID into a single transcript. Attributes common to all CDS and exon features are bubbled up to the transcript. The Parent attribute is left unset.
5. If a transcript feature is annotated but has no child CDS or exons, the transcript is assumed to be non-coding and is assembled from any transcript-type features that share its ID attribute.
- Identity relationships between elements vary between GFF3 files
Different GFF3 files specify discontiguous features differently. For example, in Flybase, different exons of a transcript will have unique IDs, but will share the same ‘Parent’ attribute in column 9 of the GFF. In Wormbase, however, different exons of the same transcript will share the same ID. Here, we first check for the Flybase style (by Parent), then fall back to Wormbase style (by shared ID).
- Transcript assembly
To save memory, transcripts are assembled lazily as follows:
1. If there exist assembled transcripts in self._transcript_cache, return the next transcript. Transcripts in the cache are stored in lexical order.
2. Otherwise, collect features from the GFF3 stream until either a ‘###’ line or EOF is encountered. Then, assemble transcripts and store them in self._transcript_cache. Delete unused features from memory. If the GFF3 is sorted, then a change in chromosome name will also trigger assembly of collected features.
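The notes above can be sketched as a small generator. This is a simplified illustration, not the assembler's actual code: grouping_key and assemble_lazily are hypothetical names, only exon features are collected, and the sample lines are made up. Exons are grouped by Parent when present (Flybase style) and by shared ID otherwise (Wormbase style), and each batch is assembled when a ‘###’ line or the end of input is reached.

```python
from collections import defaultdict


def grouping_key(attr):
    """Prefer the first Parent (Flybase style); fall back to ID (Wormbase style)."""
    parents = attr.get("Parent")
    if parents:
        return parents[0]
    return attr.get("ID")


def assemble_lazily(lines):
    """Yield (key, spans) pairs; a '###' line or end of input triggers assembly."""
    pending = defaultdict(list)

    def flush():
        for key in sorted(pending):  # lexical order within each batch
            yield key, sorted(pending[key])
        pending.clear()  # drop collected features once the batch is emitted

    for line in lines:
        line = line.rstrip("\n")
        if line == "###":
            yield from flush()
            continue
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and other directives
        cols = line.split("\t")
        attr = {}
        for item in cols[8].split(";"):
            if not item:
                continue
            name, _, value = item.partition("=")
            attr[name] = value.split(",") if name == "Parent" else value
        key = grouping_key(attr)
        if cols[2] == "exon" and key is not None:
            # 1-indexed, closed input -> 0-indexed, half-open span
            pending[key].append((int(cols[3]) - 1, int(cols[4])))
    yield from flush()


# Made-up GFF3 lines: txB uses Parent, txA uses a shared ID
gff_lines = [
    "##gff-version 3",
    "chrI\t.\texon\t11\t20\t.\t+\t.\tID=e2;Parent=txB",
    "chrI\t.\texon\t1\t5\t.\t+\t.\tID=e1;Parent=txB",
    "chrI\t.\texon\t31\t40\t.\t+\t.\tID=txA",
    "chrI\t.\texon\t51\t60\t.\t+\t.\tID=txA",
    "###",
    "chrII\t.\texon\t101\t200\t.\t-\t.\tID=e3;Parent=txC",
]
result = list(assemble_lazily(gff_lines))
```

Because the function is a generator, nothing after a ‘###’ line is read until the transcripts collected before it have been consumed, which is the source of the memory savings.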
Examples
Assemble transcripts from a GFF3 file:
>>> from plastid.readers.gff import GFF3_TranscriptAssembler
>>> transcript_reader = GFF3_TranscriptAssembler(open("some_file.gff"))
>>> for transcript in transcript_reader:
...     print(transcript.get_name(), transcript.attr["type"], str(transcript))  # do something
('YAL030W_mRNA', 'mRNA', 'chrI:87262-87387^87500-87857(+)')
('YBL092W_mRNA', 'mRNA', 'chrII:45643-45644^45977-46440(+)')
('YBL057C_mRNA', 'mRNA', 'chrII:112749-113427^113444-113450(-)')
('YBL040C_mRNA', 'mRNA', 'chrII:142033-142749^142846-142891(-)')
('YBL018C_mRNA', 'mRNA', 'chrII:185961-186352^186427-186504(-)')
('YBR012W-B', 'mRNA', 'chrII:259868-261173^261174-265140(+)')
('YBR044C_mRNA', 'mRNA', 'chrII:324292-324336^324340-326127(-)')
('YBR082C_mRNA', 'mRNA', 'chrII:406506-407027^407122-407379(-)')
('YBR126W-B_mRNA', 'mRNA', 'chrII:490824-491202(+)')
('YBR138C_mRNA', 'mRNA', 'chrII:513636-515391(-)')
[rest of output omitted]
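The "bubbling up" of attributes shared by all component features can be pictured as a dictionary intersection. The helper below is a hypothetical sketch rather than the assembler's code, and the gene_id and exon_number values are invented for illustration.

```python
def common_attributes(attr_dicts):
    """Return only the key/value pairs that every dict in attr_dicts shares."""
    if not attr_dicts:
        return {}
    first, rest = attr_dicts[0], attr_dicts[1:]
    return {
        key: value
        for key, value in first.items()
        if all(key in other and other[key] == value for other in rest)
    }


# Invented attribute dicts for two exons of the same transcript
exon_attrs = [
    {"gene_id": "YAL030W", "transcript_id": "YAL030W_mRNA", "exon_number": "1"},
    {"gene_id": "YAL030W", "transcript_id": "YAL030W_mRNA", "exon_number": "2"},
]
shared = common_attributes(exon_attrs)
```

Here exon_number differs between the two exons, so it is left out of shared, while gene_id and transcript_id are kept.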
- Attributes
- streams : file-like
Input stream, usually constructed from one or more open filehandles
- metadata : dict
Various attributes gleaned from the stream, if any
- counter : int
Cumulative line number counter over all streams
- printer : file-like, optional
Logger implementing a write() method.
- rejected : list
A list of transcript IDs from transcripts that failed to assemble properly
Methods
close() | Close stream
fileno() | Returns underlying file descriptor if one exists.
filter(data) | Return next assembled feature from self.stream
flush(/) | Flush write buffers, if applicable.
isatty() | Return whether this is an 'interactive' stream.
read() | Similar to file.read().
readable() | Return whether object was opened for reading.
readline() | Process a single line of data, assuming it is string-like. next(self) is more likely to behave as expected.
readlines() | Similar to file.readlines().
seek() | Change stream position.
seekable() | Return whether object supports random access.
tell(/) | Return current stream position.
truncate() | Truncate file to size bytes.
writable() | Return whether object was opened for writing.
writelines(lines, /) | Write a list of lines to stream.
next
- close()
Close stream
- fileno()
Returns underlying file descriptor if one exists.
OSError is raised if the IO object does not use a file descriptor.
- filter(data)
Return next assembled feature from self.stream
- Returns
SegmentChain or subclass
Next feature assembled from self.streams, type specified by self.return_type
- flush(/)
Flush write buffers, if applicable.
This is not implemented for read-only and non-blocking streams.
- isatty()
Return whether this is an ‘interactive’ stream.
Return False if it can’t be determined.
- next()
- read()
Similar to file.read(). Process all units of data, assuming it is string-like.
- Returns
- str
- readable()
Return whether object was opened for reading.
If False, read() will raise OSError.
- readline()
Process a single line of data, assuming it is string-like. next(self) is more likely to behave as expected.
- Returns
- object
a unit of processed data
- readlines()
Similar to file.readlines().
- Returns
- list
processed data
- seek()
Change stream position.
Change the stream position to the given byte offset. The offset is interpreted relative to the position indicated by whence. Values for whence are:
0 – start of stream (the default); offset should be zero or positive
1 – current stream position; offset may be negative
2 – end of stream; offset is usually negative
Return the new absolute position.
- seekable()
Return whether object supports random access.
If False, seek(), tell() and truncate() will raise OSError. This method may need to do a test seek().
- tell(/)
Return current stream position.
- truncate()
Truncate file to size bytes.
File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.
- writable()
Return whether object was opened for writing.
If False, write() will raise OSError.
- writelines(lines, /)
Write a list of lines to stream.
Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.
- closed
- class plastid.readers.gff.GTF2_Reader(*streams, end_included=True, return_stopfeatures=False, is_sorted=False, tabix=False)
Bases: plastid.readers.gff.AbstractGFF_Reader
Read raw features in GTF2 files as SegmentChains. To assemble transcripts from raw features, use GTF2_TranscriptAssembler.
Assumes input to comply with the GTF2 specification. Each element must:
- use 1-indexed, fully-closed coordinates
- have defined gene_id and transcript_id attributes
All SegmentChain objects returned by the reader have 0-indexed, half-open coordinates in keeping with Python conventions.
- Parameters
- *streams : one or more str or file-like
One or more input streams or filenames pointing to GFF information
- end_included : bool, optional
Whether the end coordinate is included in the feature (closed or ‘end-included’ intervals) or not (half-open intervals). (Default: True)
- return_stopfeatures : bool, optional
If True, will return a special SegmentChain called StopFeature signifying that all previously emitted SegmentChains may be assembled into complete entities. These are emitted when the line “###” is encountered in a GTF2. (Default: False)
- is_sorted : bool, optional
If True and return_stopfeatures is True, assume the GTF2 is sorted by chromosome. The reader will return StopFeature when the chromosome name of a given feature differs from that of the previous feature. (Default: False)
- tabix : boolean, optional
streams point to tabix-compressed files or are open tabix_file_iterator (Default: False)
Examples
Read raw features from a GTF2 file:
>>> from plastid.readers.gff import GTF2_Reader
>>> feature_reader = GTF2_Reader(open("some_file.gtf"))
>>> for feature in feature_reader:
...     print(feature.get_name(), feature.attr["type"], str(feature))
('YAL030W_mRNA', 'exon', 'chrI:87262-87387(+)')
('YAL030W_mRNA', 'exon', 'chrI:87500-87857(+)')
('YAL030W_mRNA', 'CDS', 'chrI:87285-87387(+)')
('YAL030W_mRNA', 'CDS', 'chrI:87500-87749(+)')
('YAL030W_mRNA', 'start_codon', 'chrI:87285-87288(+)')
('YAL030W_mRNA', 'stop_codon', 'chrI:87749-87752(+)')
('YBL092W_mRNA', 'exon', 'chrII:45643-45644(+)')
('YBL092W_mRNA', 'exon', 'chrII:45977-46440(+)')
('YBL092W_mRNA', 'CDS', 'chrII:45977-46367(+)')
('YBL092W_mRNA', 'start_codon', 'chrII:45977-45980(+)')
[rest of output omitted]
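The coordinate normalization described for this reader (1-indexed, fully-closed input; 0-indexed, half-open output) amounts to a small arithmetic shift. The helper below is a hypothetical sketch of that arithmetic, not the reader's code. Its handling of end_included=False reflects an assumption that such input is 1-indexed and half-open.

```python
def to_zero_indexed_half_open(start, end, end_included=True):
    """Convert 1-indexed coordinates to 0-indexed, half-open coordinates.

    With end_included=True the input interval is closed: [start, end].
    With end_included=False the input is assumed half-open: [start, end).
    """
    new_start = start - 1
    new_end = end if end_included else end - 1
    return new_start, new_end


# A GTF2 exon spanning positions 87263..87387 (1-indexed, closed)
exon = to_zero_indexed_half_open(87263, 87387)
exon_length = exon[1] - exon[0]
```

The result, (87262, 87387), has the same form as the first exon in the example output above, and the length is simply end minus start.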
- Attributes
- metadata : dict
Dictionary of metadata found in file headers
Methods
close() | Close stream
fileno() | Returns underlying file descriptor if one exists.
filter(line) | Parses lines of the GFF stream into SegmentChain. When metadata is found, temporarily delegates processing to _parse_metatokens(), and then reads the next genomic feature.
flush(/) | Flush write buffers, if applicable.
isatty() | Return whether this is an 'interactive' stream.
read() | Similar to file.read().
readable() | Return whether object was opened for reading.
readline() | Process a single line of data, assuming it is string-like. next(self) is more likely to behave as expected.
readlines() | Similar to file.readlines().
seek() | Change stream position.
seekable() | Return whether object supports random access.
tell(/) | Return current stream position.
truncate() | Truncate file to size bytes.
writable() | Return whether object was opened for writing.
writelines(lines, /) | Write a list of lines to stream.
next
- close()
Close stream
- fileno()
Returns underlying file descriptor if one exists.
OSError is raised if the IO object does not use a file descriptor.
- filter(line)
Parses lines of the GFF stream into SegmentChain. When metadata is found, temporarily delegates processing to _parse_metatokens(), and then reads the next genomic feature.
- Parameters
- line
Next line from GFF stream
- Returns
SegmentChain
Next feature in file
- flush(/)
Flush write buffers, if applicable.
This is not implemented for read-only and non-blocking streams.
- isatty()
Return whether this is an ‘interactive’ stream.
Return False if it can’t be determined.
- next()
- read()
Similar to file.read(). Process all units of data, assuming it is string-like.
- Returns
- str
- readable()
Return whether object was opened for reading.
If False, read() will raise OSError.
- readline()
Process a single line of data, assuming it is string-like. next(self) is more likely to behave as expected.
- Returns
- object
a unit of processed data
- readlines()
Similar to file.readlines().
- Returns
- list
processed data
- seek()
Change stream position.
Change the stream position to the given byte offset. The offset is interpreted relative to the position indicated by whence. Values for whence are:
0 – start of stream (the default); offset should be zero or positive
1 – current stream position; offset may be negative
2 – end of stream; offset is usually negative
Return the new absolute position.
- seekable()
Return whether object supports random access.
If False, seek(), tell() and truncate() will raise OSError. This method may need to do a test seek().
- tell(/)
Return current stream position.
- truncate()
Truncate file to size bytes.
File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.
- writable()
Return whether object was opened for writing.
If False, write() will raise OSError.
- writelines(lines, /)
Write a list of lines to stream.
Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.
- closed
- class plastid.readers.gff.GTF2_TranscriptAssembler(*streams, is_sorted=False, return_type=SegmentChain, add_three_for_stop=False, printer=None, tabix=False)
Bases: plastid.readers.gff.AbstractGFF_Assembler
Assemble Transcripts from raw features in GTF2 format.
Exons and CDS features are grouped by shared transcript_id. Attributes that have common values for all exons and CDS within a transcript are propagated up to the attr dict of the assembled Transcript. Other attributes from individual CDS or exon components are discarded.
The assembler functions as an iterator. Within each chromosome, transcripts are returned in lexical order.
For access to raw features, instead use GTF2_Reader.
- Parameters
- *streams : one or more str or file-like
One or more input streams or filenames pointing to GTF2 data
- is_sorted : bool, optional
GTF2 is sorted by chromosome name, allowing some memory savings (Default: False)
- return_type : SegmentChain or subclass, optional
Type of feature to return from assembled subfeatures (Default: SegmentChain)
- add_three_for_stop : bool, optional
Some annotation files exclude the stop codon from CDS annotations. If set to True, three nucleotides will be added to the 3' end of each CDS annotation, UNLESS the annotated transcript contains an explicit stop_codon feature. (Default: False)
- printer : file-like, optional
Logger implementing a write() method. (Default: NullWriter)
- tabix : boolean, optional
streams point to tabix-compressed files or are open tabix_file_iterator (Default: False)
Examples
Assemble transcripts from a GTF2 file:
>>> from plastid.readers.gff import GTF2_TranscriptAssembler
>>> transcript_reader = GTF2_TranscriptAssembler(open("some_file.gtf"))
>>> for transcript in transcript_reader:
...     print(transcript.get_name(), transcript.attr["type"], str(transcript))  # do something
('YAL030W_mRNA', 'mRNA', 'chrI:87262-87387^87500-87857(+)')
('YBL092W_mRNA', 'mRNA', 'chrII:45643-45644^45977-46440(+)')
('YBL057C_mRNA', 'mRNA', 'chrII:112749-113427^113444-113450(-)')
('YBL040C_mRNA', 'mRNA', 'chrII:142033-142749^142846-142891(-)')
('YBL018C_mRNA', 'mRNA', 'chrII:185961-186352^186427-186504(-)')
('YBR012W-B', 'mRNA', 'chrII:259868-261173^261174-265140(+)')
('YBR044C_mRNA', 'mRNA', 'chrII:324292-324336^324340-326127(-)')
('YBR082C_mRNA', 'mRNA', 'chrII:406506-407027^407122-407379(-)')
('YBR126W-B_mRNA', 'mRNA', 'chrII:490824-491202(+)')
('YBR138C_mRNA', 'mRNA', 'chrII:513636-515391(-)')
[rest of output omitted]
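The effect of add_three_for_stop can be illustrated with a few lines of arithmetic. The function below is a hypothetical sketch, not the assembler's code: it works on the overall CDS bounds in 0-indexed, half-open coordinates, treats the 3' end as the higher coordinate on the plus strand and the lower coordinate on the minus strand, and ignores the case where the three added nucleotides would cross an exon boundary.

```python
def extend_cds_for_stop(cds_start, cds_end, strand, has_stop_codon_feature=False):
    """Return CDS bounds extended by three nucleotides at the 3' end.

    If the transcript already carries an explicit stop_codon feature,
    the bounds are returned unchanged, mirroring the documented exception.
    """
    if has_stop_codon_feature:
        return cds_start, cds_end
    if strand == "+":
        return cds_start, cds_end + 3  # 3' end is the higher coordinate
    return cds_start - 3, cds_end      # minus strand: 3' end is the lower coordinate


# CDS bounds of YAL030W_mRNA from the raw-feature examples: 87285-87749(+)
plus_strand = extend_cds_for_stop(87285, 87749, "+")
minus_strand = extend_cds_for_stop(1000, 1300, "-")
untouched = extend_cds_for_stop(87285, 87749, "+", has_stop_codon_feature=True)
```

For YAL030W_mRNA the extended end, 87752, coincides with the end of the stop_codon feature listed in the raw GTF2 output (chrI:87749-87752).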
- Attributes
- streams : file-like
Input streams, usually constructed from one or more open filehandles
- metadata : dict
Various attributes gleaned from the streams, if any
- counter : int
Cumulative line number counter over all streams
- printer : file-like, optional
Logger implementing a write() method.
- rejected : list
A list of transcript IDs from transcripts that failed to assemble properly
Methods
close() | Close stream
fileno() | Returns underlying file descriptor if one exists.
filter(data) | Return next assembled feature from self.stream
flush(/) | Flush write buffers, if applicable.
isatty() | Return whether this is an 'interactive' stream.
read() | Similar to file.read().
readable() | Return whether object was opened for reading.
readline() | Process a single line of data, assuming it is string-like. next(self) is more likely to behave as expected.
readlines() | Similar to file.readlines().
seek() | Change stream position.
seekable() | Return whether object supports random access.
tell(/) | Return current stream position.
truncate() | Truncate file to size bytes.
writable() | Return whether object was opened for writing.
writelines(lines, /) | Write a list of lines to stream.
next
- close()
Close stream
- fileno()
Returns underlying file descriptor if one exists.
OSError is raised if the IO object does not use a file descriptor.
- filter(data)
Return next assembled feature from self.stream
- Returns
SegmentChain or subclass
Next feature assembled from self.streams, type specified by self.return_type
- flush(/)
Flush write buffers, if applicable.
This is not implemented for read-only and non-blocking streams.
- isatty()
Return whether this is an ‘interactive’ stream.
Return False if it can’t be determined.
- next()
- read()
Similar to file.read(). Process all units of data, assuming it is string-like.
- Returns
- str
- readable()
Return whether object was opened for reading.
If False, read() will raise OSError.
- readline()
Process a single line of data, assuming it is string-like. next(self) is more likely to behave as expected.
- Returns
- object
a unit of processed data
- readlines()
Similar to file.readlines().
- Returns
- list
processed data
- seek()
Change stream position.
Change the stream position to the given byte offset. The offset is interpreted relative to the position indicated by whence. Values for whence are:
0 – start of stream (the default); offset should be zero or positive
1 – current stream position; offset may be negative
2 – end of stream; offset is usually negative
Return the new absolute position.
- seekable()
Return whether object supports random access.
If False, seek(), tell() and truncate() will raise OSError. This method may need to do a test seek().
- tell(/)
Return current stream position.
- truncate()
Truncate file to size bytes.
File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.
- writable()
Return whether object was opened for writing.
If False, write() will raise OSError.
- writelines(lines, /)
Write a list of lines to stream.
Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.
- closed
- dtmp = {'CDS_like': {}, 'exon_like': {}}
- plastid.readers.gff.StopFeature = <SegmentChain segments=1 bounds=Stop:0-1(.) name=StopFeature>
Special SegmentChain emitted from GFF readers when:
- the special line ### is encountered
- the special line ###FASTA is encountered
- a GFF file is marked as sorted, and the contig/chromosome changes
- the source stream of features is changed
indicating that all previously returned features may be assembled into full objects.
Note
Because StopFeature is zero-length, it does not evaluate as equal to itself. Use x is StopFeature or x is not StopFeature when testing for equality.
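The identity test recommended in the note can be demonstrated with a stand-in object. Everything in this sketch is hypothetical (ZeroLengthFeature, STOP, split_on_sentinel); it only mimics the documented behavior that a zero-length feature does not compare equal to itself, and shows why `is` is the reliable check when splitting a feature stream into batches.

```python
class ZeroLengthFeature:
    """Hypothetical stand-in whose equality check fails for zero-length objects."""

    def __init__(self, name, length=0):
        self.name = name
        self.length = length

    def __eq__(self, other):
        # Mimic the documented behavior: zero-length features never compare equal
        if self.length == 0:
            return False
        return (
            isinstance(other, ZeroLengthFeature)
            and (self.name, self.length) == (other.name, other.length)
        )

    __hash__ = object.__hash__


STOP = ZeroLengthFeature("StopFeature")  # plays the role of StopFeature


def split_on_sentinel(items, sentinel):
    """Split items into batches at each sentinel, comparing by identity."""
    batches, current = [], []
    for item in items:
        if item is sentinel:  # `==` would never match a zero-length sentinel
            batches.append(current)
            current = []
        else:
            current.append(item)
    if current:
        batches.append(current)
    return batches


equal_to_itself = STOP == STOP
batches = split_on_sentinel(["exon1", "exon2", STOP, "exon3"], STOP)
```

With an equality test in place of `is`, the sentinel would never be recognized and all items would end up in a single batch.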