plastid.util.io.openers module

Various wrappers and utilities for opening, closing, and writing files.

Important methods

argsopener()
Opens a file for writing within a command-line script and writes all command-line arguments to it as a pretty-printed dictionary of metadata, commented out. The open file handle is then returned for subsequent writing.
read_pl_table()
Wrapper function to open a table saved by one of plastid’s command-line scripts into a pandas.DataFrame.
opener()
Guesses whether a file is bzipped, gzipped, zipped, or uncompressed based upon file extension, opens it appropriately, and returns a file-like object.
NullWriter()
Returns an open filehandle to the system’s null location.
class plastid.util.io.openers.NullWriter

Bases: plastid.util.io.filters.AbstractWriter

Writes to the system-dependent null location. On Unix-like systems and OS X, this is typically /dev/null; on Windows, it is “nul”.
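
The null location can be illustrated with the standard library alone. The sketch below does not use NullWriter itself; it only shows that data written to os.devnull is discarded, which is the behavior the class description refers to.

```python
import os

# os.devnull is "/dev/null" on POSIX systems and "nul" on Windows.
with open(os.devnull, "w") as sink:
    written = sink.write("this text is discarded\n")

# Reading the null device back yields no data.
with open(os.devnull, "r") as source:
    leftover = source.read()
```

Here written reports the number of characters accepted by the stream, while leftover is an empty string because nothing was stored.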

Attributes:
closed

Methods

close() Flush and close self.stream
flush() Flush self.stream
next
readline Read and return a line from the stream.
readlines Return a list of lines from the stream.
seek Change stream position.
tell Return current stream position.
truncate Truncate file to size bytes.
write(data) Write data to self.stream
fileno  
filter  
isatty  
readable  
seekable  
writable  
writelines  
close()

Flush and close self.stream

fileno()

Returns underlying file descriptor if one exists.

An IOError is raised if the IO object does not use a file descriptor.

filter(stream)

Method that filters or processes each unit of data. Override this in subclasses.

Parameters:
data : unit of data

Whatever data to filter/format. Often a string, but not necessarily.

Returns:
object

Formatted data. Often a string, but not necessarily.
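
The override pattern can be sketched as follows. UppercaseWriter is a hypothetical stand-in class, not part of plastid, and it assumes that write() passes each unit of data through filter() before sending it to the wrapped stream.

```python
import io


class UppercaseWriter:
    """Hypothetical writer that filters data before writing it (not plastid code)."""

    def __init__(self, stream):
        self.stream = stream

    def filter(self, data):
        # A subclass would replace this with its own formatting logic.
        return data.upper()

    def write(self, data):
        # Assumption: every unit of data is filtered, then written.
        self.stream.write(self.filter(data))


sink = io.StringIO()
writer = UppercaseWriter(sink)
writer.write("chrI\t100\t200\n")
result = sink.getvalue()
```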

flush()

Flush self.stream

isatty()

Return whether this is an ‘interactive’ stream.

Return False if it can’t be determined.

readable()

Return whether object was opened for reading.

If False, read() will raise IOError.

readline()

Read and return a line from the stream.

If limit is specified, at most limit bytes will be read.

The line terminator is always b'\n' for binary files; for text files, the newline argument to open() can be used to select the line terminator(s) recognized.

readlines()

Return a list of lines from the stream.

hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.

seek()

Change stream position.

Change the stream position to the given byte offset. The offset is interpreted relative to the position indicated by whence. Values for whence are:

  • 0 – start of stream (the default); offset should be zero or positive
  • 1 – current stream position; offset may be negative
  • 2 – end of stream; offset is usually negative

Return the new absolute position.
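
The three whence values can be tried on an in-memory stream. This sketch uses io.BytesIO from the standard library rather than NullWriter, since a null device holds no data to seek through.

```python
import io
import os

buf = io.BytesIO(b"0123456789")

pos_from_start = buf.seek(3, os.SEEK_SET)      # whence 0: offset from the start
two_bytes = buf.read(2)                        # reading advances the position by 2
pos_from_current = buf.seek(-1, os.SEEK_CUR)   # whence 1: offset from the current position
pos_from_end = buf.seek(-2, os.SEEK_END)       # whence 2: offset from the end
tail = buf.read()
```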

seekable()

Return whether object supports random access.

If False, seek(), tell() and truncate() will raise IOError. This method may need to do a test seek().

tell()

Return current stream position.

truncate()

Truncate file to size bytes.

File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.
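
A short sketch of both forms of truncate(), again using io.BytesIO as a stand-in stream. It shows that the size defaults to the current position and that the file pointer does not move.

```python
import io

buf = io.BytesIO(b"0123456789")
buf.seek(4)

new_size = buf.truncate()         # no argument: size defaults to tell()
after_default = buf.getvalue()

explicit_size = buf.truncate(2)   # explicit size, smaller than the position
after_explicit = buf.getvalue()
position = buf.tell()             # the position itself was never changed
```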

writable()

Return whether object was opened for writing.

If False, write() and truncate() will raise IOError.

write(data)

Write data to self.stream

Parameters:
data : unit of data

Whatever data to filter/format. Often a string, but not necessarily.

writelines()
closed
next
plastid.util.io.openers.args_to_comment(namespace)

Formats an argparse.Namespace into a comment block suitable for the headers of output files.

Parameters:
namespace : argparse.Namespace

Namespace object returned by argparse.ArgumentParser

Returns:
string
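
The general idea can be sketched with the standard library. namespace_to_comment and its "## " prefix are hypothetical; the exact layout produced by args_to_comment() is not specified here and may differ.

```python
import argparse
import pprint


def namespace_to_comment(namespace, prefix="## "):
    """Hypothetical helper: render parsed arguments as commented-out lines."""
    # vars() exposes the attributes of an argparse.Namespace as a dict.
    body = pprint.pformat(vars(namespace))
    return "\n".join(prefix + line for line in body.splitlines()) + "\n"


parser = argparse.ArgumentParser()
parser.add_argument("--min_length", type=int, default=25)
parser.add_argument("outfile")
args = parser.parse_args(["--min_length", "30", "out.txt"])

comment = namespace_to_comment(args)
```

Every line of comment starts with the prefix, so a reader that skips comment lines ignores the whole block.
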
plastid.util.io.openers.argsopener(filename, namespace, mode='w', **kwargs)

Open a file for writing, and write to it command-line arguments formatted as a pretty-printed dictionary in comment metadata.

Parameters:
filename : str

Name of file to open. If it ends in ‘.gz’ or ‘.bz2’, the filehandle will write to a gzipped or bzipped file, respectively.

namespace : argparse.Namespace

Namespace object from argparse.ArgumentParser

mode : str

Mode of writing (‘w’ or ‘wb’)

**kwargs

Other keyword arguments to pass to file opener

Returns:
open filehandle
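
A minimal sketch of the workflow, assuming a plain uncompressed file. open_with_header is a hypothetical stand-in for argsopener(), and the header layout is illustrative only.

```python
import argparse
import os
import tempfile


def open_with_header(filename, namespace, mode="w"):
    """Hypothetical stand-in: open a file and write arguments as comment lines."""
    fh = open(filename, mode)
    for key, value in sorted(vars(namespace).items()):
        fh.write("## %s: %r\n" % (key, value))
    return fh  # still open, ready for the caller's own output


args = argparse.Namespace(min_length=30, sample="wt")
path = os.path.join(tempfile.mkdtemp(), "counts.txt")

with open_with_header(path, args) as fout:
    fout.write("region\tcount\n")
    fout.write("geneA\t12\n")

with open(path) as fin:
    lines = fin.read().splitlines()

header = [line for line in lines if line.startswith("#")]
data = [line for line in lines if not line.startswith("#")]
```

Because the metadata is commented out, the data rows can be recovered later by skipping lines that begin with “#”.
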
plastid.util.io.openers.get_short_name(inpt, separator='/', terminator='')

Gives the basename of a filename or module name passed as a string. If the string doesn’t match the pattern specified by the separator and terminator, it is returned unchanged.

Parameters:
inpt : str

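Input filename or module name

separator : str

Pattern separating path or module components (default: “/”)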

terminator : str

File terminator (default: “”)

Returns:
str

Examples

>>> get_short_name("test")
'test'
>>> get_short_name("test.py",terminator=".py")
'test'
>>> get_short_name("/home/jdoe/test.py",terminator=".py")
'test'
>>> get_short_name("/home/jdoe/test.py.py",terminator=".py")
'test.py'
>>> get_short_name("/home/jdoe/test.py.2")
'test.py.2'
>>> get_short_name("/home/jdoe/test.py.2",terminator=".py")
'test.py.2'
>>> get_short_name("plastid.bin.test",separator="\.",terminator="")
'test'
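
One way to reproduce the documented results is sketched below. short_name_sketch is hypothetical and is not plastid's implementation; it assumes the separator is a regular expression (as the "\." example suggests) and treats the terminator as a literal suffix.

```python
import re


def short_name_sketch(inpt, separator="/", terminator=""):
    """Hypothetical re-creation of the documented behavior (not plastid code)."""
    # Keep only the last component after splitting on the separator pattern.
    name = re.split(separator, inpt)[-1]
    # Strip the terminator once, and only if the name actually ends with it.
    if terminator and name.endswith(terminator):
        name = name[: -len(terminator)]
    return name


from_path = short_name_sketch("/home/jdoe/test.py", terminator=".py")
from_module = short_name_sketch("plastid.bin.test", separator=r"\.")
```
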
plastid.util.io.openers.multiopen(inp, fn=None, args=None, kwargs=None)

Normalize a filename, a file-like object, or a list of either into a sequence of appropriate objects.

If not list-like, inp is converted to a list. Then, for each element x in inp, if x is file-like, it is yielded. Otherwise, fn is applied to x, and the result yielded.

Parameters:
inp : str, file-like, or list-like of either of those

Input describing file(s) to open

fn : callable, optional

Callable to apply to input to open it

args : tuple, optional

Tuple of positional arguments to pass to fn

kwargs : dict, optional

Dictionary of keyword arguments to pass to fn

Yields:
Object

Result of applying fn to filename(s) in inp
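
The normalization described above can be sketched as a generator. normalize_inputs is hypothetical; it assumes that "file-like" means having a read attribute and that items pass through unchanged when fn is None, neither of which is confirmed for multiopen() itself.

```python
import io


def normalize_inputs(inp, fn=None, args=None, kwargs=None):
    """Hypothetical sketch of the normalization (not plastid code)."""
    args = args or ()
    kwargs = kwargs or {}
    # A single filename or a single file-like object becomes a one-item list.
    if isinstance(inp, str) or hasattr(inp, "read"):
        inp = [inp]
    for item in inp:
        if hasattr(item, "read") or fn is None:
            yield item                       # already usable as-is
        else:
            yield fn(item, *args, **kwargs)  # e.g. an opener applied to a filename


handle = io.StringIO("already open")
mixed = list(normalize_inputs(["a.txt", handle], fn=str.upper))
single = list(normalize_inputs("b.txt", fn=str.upper))
```

str.upper stands in for a real opener here so that the example touches no files.
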

plastid.util.io.openers.opener(filename, mode='r', **kwargs)

Open a file, detecting whether it is compressed or not, based upon its file extension. Extensions are tested in the following order:

File ends with    Presumed to be
gz                gzipped
bz2               bzipped
zip               zipped
anything else     uncompressed

Parameters:
filename : str

Name of file to open

mode : str

Mode in which to open file. See the Python standard library documentation on file opening modes for choices (e.g. “r”, “a”, “w”, with or without “b”)

**kwargs

Other parameters to pass to appropriate file opener
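
Extension-based dispatch can be sketched with the standard library. open_by_extension is a hypothetical stand-in for opener(); it handles only the gz, bz2, and uncompressed cases and leaves out zip archives for brevity.

```python
import bz2
import gzip
import os
import tempfile


def open_by_extension(filename, mode="r", **kwargs):
    """Hypothetical sketch: choose an opener from the file extension."""
    if filename.endswith(".gz"):
        return gzip.open(filename, mode, **kwargs)
    if filename.endswith(".bz2"):
        return bz2.open(filename, mode, **kwargs)
    return open(filename, mode, **kwargs)


path = os.path.join(tempfile.mkdtemp(), "example.txt.gz")

# Text modes ("wt", "rt") are spelled out because gzip.open defaults to binary.
with open_by_extension(path, "wt") as fout:
    fout.write("hello\n")

with open_by_extension(path, "rt") as fin:
    text = fin.read()

# The first two bytes on disk are the gzip magic number.
with open(path, "rb") as raw:
    magic = raw.read(2)
```
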

plastid.util.io.openers.pretty_print_dict(dtmp)

Pretty prints an un-nested dictionary

Parameters:
dtmp : dict
Returns:
str

pretty-printed dictionary

plastid.util.io.openers.read_pl_table(filename, **kwargs)

Open a table saved by one of plastid’s command-line scripts, passing default arguments to pandas.read_table():

Key          Value
sep          “\t”
comment      “#”
index_col    None
header       0

Parameters:
filename : str

Name of file. Can be gzipped, bzipped, or zipped.

kwargs : keyword arguments

Other keyword arguments to pass to pandas.read_table(). Will override defaults.

Returns:
pandas.DataFrame

Table of results
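
The effect of these defaults can be seen by passing them to pandas directly. This sketch uses pandas.read_csv on an in-memory table instead of read_pl_table(), and it assumes the sep default is a tab, as the tab-delimited output of write_pl_table() suggests.

```python
import io

import pandas as pd

text = (
    "## metadata written by a script\n"
    "region\tcount\n"
    "geneA\t12\n"
    "geneB\t7\n"
)

# Defaults as listed above; sep="\t" is an assumption.
defaults = dict(sep="\t", comment="#", index_col=None, header=0)
df = pd.read_csv(io.StringIO(text), **defaults)

# A caller-supplied keyword replaces the corresponding default.
overridden = dict(defaults, index_col=0)
df_indexed = pd.read_csv(io.StringIO(text), **overridden)
```

The commented metadata line is skipped, so header=0 picks up the first uncommented row as column names.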

plastid.util.io.openers.write_pl_table(df, filename, sep='\t', header=True, index=None, **kwargs)

Wrapper function to write the DataFrame df to a tab-delimited table with a header row.

Parameters:
df : DataFrame

DataFrame to save

filename : str

Name of file to create

**kwargs : keyword arguments, optional

Any keyword argument readable by pandas.DataFrame.to_csv().
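
The equivalent call to pandas.DataFrame.to_csv() looks roughly like the sketch below. It writes to an in-memory buffer instead of a file and passes index=False explicitly to leave out the row index, whereas the signature above defaults to index=None.

```python
import io

import pandas as pd

df = pd.DataFrame({"region": ["geneA", "geneB"], "count": [12, 7]})

buf = io.StringIO()
# Tab-delimited, with a header row and no index column.
df.to_csv(buf, sep="\t", header=True, index=False)
rows = buf.getvalue().splitlines()
```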