plastid.readers.common module

Constants, functions, and classes used by multiple readers in this subpackage

Functions & classes

get_identical_attributes()

Return a dictionary of all key-value pairs that are common to all attr dictionaries in a set of SegmentChains

AssembledFeatureReader

Base class for readers that assemble high-level features (e.g. gapped alignments or transcripts) from one or more sub-features in an annotation file

class plastid.readers.common.AssembledFeatureReader(*streams, return_type=SegmentChain, add_three_for_stop=False, tabix=False, printer=None, **kwargs)

Bases: plastid.util.io.filters.AbstractReader

Abstract base class for readers that yield complex or discontinuous features such as transcripts or gapped alignments.

For memory efficiency, all readers function as iterators. Readers built by subclassing AssembledFeatureReader are responsible for:

  • choosing when to yield assembled features

  • deciding how many subfeatures to hold in memory

  • overloading _assemble()
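The division of labor above can be illustrated with a minimal, self-contained sketch. It does not import plastid: the class name `ToyAssembledReader`, the tab-delimited input format, and the signature and return value of `_assemble()` are all assumptions made for illustration, not the library's actual interface.

```python
import io
from collections import OrderedDict


class ToyAssembledReader:
    """Illustrative stand-in for a reader subclass; NOT plastid's implementation."""

    def __init__(self, *streams):
        self.streams = streams
        self.counter = 0      # cumulative line count over all streams
        self.rejected = []    # IDs of features that failed to assemble

    def _assemble(self, feature_id, segments):
        """Hypothetical assembly step: sort sub-features and reject overlaps."""
        segments = sorted(segments)
        for (_, end1), (start2, _) in zip(segments, segments[1:]):
            if start2 < end1:  # overlapping sub-features cannot form one chain
                self.rejected.append(feature_id)
                return None
        return (feature_id, segments)

    def __iter__(self):
        for stream in self.streams:
            # Policy choice: hold every sub-feature of one stream in memory
            pending = OrderedDict()
            for line in stream:
                self.counter += 1
                feature_id, start, end = line.rstrip("\n").split("\t")
                pending.setdefault(feature_id, []).append((int(start), int(end)))
            # Policy choice: yield assembled features once the stream is exhausted
            for feature_id, segments in pending.items():
                assembled = self._assemble(feature_id, segments)
                if assembled is not None:
                    yield assembled


# Assumed toy input: feature ID, start, end (half-open), tab-delimited
stream = io.StringIO("tx1\t0\t10\ntx1\t20\t30\ntx2\t5\t15\ntx2\t10\t20\n")
reader = ToyAssembledReader(stream)
features = list(reader)
```

A reader for sorted input could instead yield each feature as soon as its ID changes, which would keep far fewer sub-features in memory; that trade-off is the kind of decision left to each subclass.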

Parameters
*streams : file-like

One or more open filehandles of input data.

return_type : SegmentChain or subclass, optional

Type of feature to return from assembled subfeatures (Default: SegmentChain)

add_three_for_stop : bool, optional

Some annotation files exclude the stop codon from CDS annotations. If set to True, three nucleotides will be added to the 3′ end of each CDS annotation, unless the annotated transcript contains an explicit stop_codon feature. (Default: False)

printer : file-like, optional

Logger implementing a write() method (Default: NullWriter)

tabix : bool, optional

Whether the input streams are tabix-compressed (Default: False)

**kwargs

Other keyword arguments used by specific parsers
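To make the effect of add_three_for_stop concrete, the following sketch extends a CDS by three nucleotides at its 3′ end in a strand-aware way. It is an illustration only: the function name `extend_cds_for_stop`, the half-open `(start, end)` tuples, and the boolean flag standing in for "an explicit stop_codon feature was annotated" are assumptions, and the sketch ignores the case in which a stop codon is split across an intron.

```python
def extend_cds_for_stop(cds_segments, strand, has_stop_codon_feature):
    """Return CDS segments with 3 nt added at the 3' end.

    cds_segments: list of half-open (start, end) intervals, sorted by start.
    The segments are returned unchanged if a stop_codon feature was annotated.
    """
    segments = list(cds_segments)
    if has_stop_codon_feature or not segments:
        return segments
    if strand == "+":
        # On the plus strand the 3' end is the highest coordinate
        start, end = segments[-1]
        segments[-1] = (start, end + 3)
    else:
        # On the minus strand the 3' end is the lowest coordinate
        start, end = segments[0]
        segments[0] = (start - 3, end)
    return segments


plus = extend_cds_for_stop([(100, 200), (300, 397)], "+", False)
minus = extend_cds_for_stop([(100, 200), (300, 397)], "-", False)
unchanged = extend_cds_for_stop([(100, 200), (300, 397)], "+", True)
```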

Attributes
streams : file-like

Input streams, usually constructed from one or more open filehandles

metadata : dict

Various attributes gleaned from the stream, if any

counter : int

Cumulative line number counter over all streams

printer : file-like, optional

Logger implementing a write() method.

return_type : class

The type of object assembled by the reader. Typically a SegmentChain or a subclass thereof.

rejected : list

A list of transcript IDs that failed to assemble properly

Methods

close()

Close stream

fileno()

Returns underlying file descriptor if one exists.

filter(data)

Return next assembled feature from self.streams

flush(/)

Flush write buffers, if applicable.

isatty()

Return whether this is an 'interactive' stream.

read()

Similar to file.read().

readable()

Return whether object was opened for reading.

readline()

Process a single line of data, assuming it is string-like. next(self) is more likely to behave as expected.

readlines()

Similar to file.readlines().

seek

Change stream position.

seekable()

Return whether object supports random access.

tell(/)

Return current stream position.

truncate

Truncate file to size bytes.

writable()

Return whether object was opened for writing.

writelines(lines, /)

Write a list of lines to stream.

next

close()

Close stream

fileno()

Returns underlying file descriptor if one exists.

OSError is raised if the IO object does not use a file descriptor.

filter(data)

Return next assembled feature from self.streams

Returns
SegmentChain or subclass

Next feature assembled from self.streams, type specified by self.return_type

flush(/)

Flush write buffers, if applicable.

This is not implemented for read-only and non-blocking streams.

isatty()

Return whether this is an ‘interactive’ stream.

Return False if it can’t be determined.

next()

read()

Similar to file.read(). Process all units of data, assuming it is string-like

Returns
str
readable()

Return whether object was opened for reading.

If False, read() will raise OSError.

readline()

Process a single line of data, assuming it is string-like. next(self) is more likely to behave as expected.

Returns
object

a unit of processed data

readlines()

Similar to file.readlines().

Returns
list

processed data

seek()

Change stream position.

Change the stream position to the given byte offset. The offset is interpreted relative to the position indicated by whence. Values for whence are:

  • 0 – start of stream (the default); offset should be zero or positive

  • 1 – current stream position; offset may be negative

  • 2 – end of stream; offset is usually negative

Return the new absolute position.
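The three whence modes can be demonstrated with the standard library. This sketch uses `io.BytesIO` as a stand-in for a seekable stream; whether a particular reader supports random access is an assumption to verify with seekable() before calling seek().

```python
import io

stream = io.BytesIO(b"0123456789")  # 10-byte stand-in stream

# whence=0: offset counted from the start of the stream
pos_from_start = stream.seek(3, io.SEEK_SET)

# whence=1: offset counted from the current position (may be negative)
pos_from_current = stream.seek(-2, io.SEEK_CUR)

# whence=2: offset counted from the end of the stream (usually negative)
pos_from_end = stream.seek(-4, io.SEEK_END)

# Reading resumes at the position returned by the last seek()
tail = stream.read()
```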

seekable()

Return whether object supports random access.

If False, seek(), tell() and truncate() will raise OSError. This method may need to do a test seek().

tell(/)

Return current stream position.

truncate()

Truncate file to size bytes.

File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.

writable()

Return whether object was opened for writing.

If False, write() will raise OSError.

writelines(lines, /)

Write a list of lines to stream.

Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.

closed

plastid.readers.common.get_identical_attributes(features, exclude=None)

Return a dictionary of all key-value pairs that are identical for all SegmentChains in features

Parameters
features : list

list of SegmentChains

exclude : set

attributes to exclude from identity criteria

Returns
dict

Dictionary of all key-value pairs that have identical values in all the attr dictionaries of all the features in features
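The behavior described above can be sketched on plain dictionaries, without importing plastid. The function name `identical_attributes` is hypothetical, the inputs are ordinary dicts standing in for the attr dictionaries of SegmentChains, and the sketch assumes that keys listed in exclude are simply left out of the result.

```python
def identical_attributes(attr_dicts, exclude=None):
    """Return key-value pairs that are present and equal in every dict."""
    exclude = set() if exclude is None else set(exclude)
    if not attr_dicts:
        return {}
    first, rest = attr_dicts[0], attr_dicts[1:]
    common = {}
    for key, value in first.items():
        if key in exclude:
            continue  # assumed meaning of `exclude`: never report this key
        # Keep the pair only if every other dict has the same key and value
        if all(key in other and other[key] == value for other in rest):
            common[key] = value
    return common


attrs = [
    {"gene_id": "g1", "ID": "tx1", "source": "ens"},
    {"gene_id": "g1", "ID": "tx2", "source": "ens"},
]
shared = identical_attributes(attrs)
shared_without_source = identical_attributes(attrs, exclude={"source"})
```

Here "ID" differs between the two dictionaries and so is dropped, while "gene_id" and "source" survive unless excluded.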