plastid.util.io.openers module

Various wrappers and utilities for opening, closing, and writing files.

Important methods

argsopener()

Opens a file for writing within a command-line script and writes all command-line arguments to it as a pretty-printed, commented-out dictionary of metadata. The open file handle is then returned for subsequent writing.

read_pl_table()

Wrapper function to open a table saved by one of plastid’s command-line scripts into a pandas.DataFrame.

opener()

Guesses whether a file is bzipped, gzipped, zipped, or uncompressed based upon file extension, opens it appropriately, and returns a file-like object.

NullWriter()

Returns an open filehandle to the system’s null location.

class plastid.util.io.openers.NullWriter

Bases: plastid.util.io.filters.AbstractWriter

Writes to the system-dependent null location. On Unix-like systems and macOS, this is typically /dev/null; on Windows, it is “nul”.
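As a rough illustration of the idea (not plastid's implementation), the Python standard library exposes the platform's null device as os.devnull, so a handle that discards everything can be sketched as follows. The helper name open_null is hypothetical.

```python
import os


def open_null(mode="w"):
    """Return an open handle to the platform's null device (hypothetical helper)."""
    # os.devnull is '/dev/null' on POSIX systems and 'nul' on Windows
    return open(os.devnull, mode)


sink = open_null()
# write() still reports how many characters it accepted, even though they are discarded
written = sink.write("this text is discarded\n")
sink.close()
```

A sink like this is handy when a function insists on a writable handle but the output is not wanted.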

Attributes
closed

Methods

close()

Flush and close self.stream

fileno()

Returns underlying file descriptor if one exists.

filter(stream)

Method that filters or processes each unit of data.

flush()

Flush self.stream

isatty()

Return whether this is an 'interactive' stream.

readable()

Return whether object was opened for reading.

readline([size])

Read and return a line from the stream.

readlines([hint])

Return a list of lines from the stream.

seek()

Change stream position.

seekable()

Return whether object supports random access.

tell(/)

Return current stream position.

truncate()

Truncate file to size bytes.

writable()

Return whether object was opened for writing.

write(data)

Write data to self.stream

writelines(lines, /)

Write a list of lines to stream.

close()

Flush and close self.stream

fileno()

Returns underlying file descriptor if one exists.

OSError is raised if the IO object does not use a file descriptor.

filter(stream)

Method that filters or processes each unit of data. Override this in subclasses.

Parameters
data : unit of data

Whatever data to filter or format. Often a string, but not necessarily.

Returns
object

Formatted data. Often a string, but not necessarily.
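The override pattern might look like the sketch below. It assumes that write() passes each unit of data through filter() before writing it to the stream, which the descriptions here suggest but do not state outright. SketchWriter and UpperCaseWriter are hypothetical stand-ins, not plastid classes.

```python
import io


class SketchWriter:
    """Minimal stand-in for a filtering writer (hypothetical, not plastid's class)."""

    def __init__(self, stream):
        self.stream = stream

    def filter(self, data):
        # Default behaviour: pass data through unchanged
        return data

    def write(self, data):
        # Assumption: every unit of data goes through filter() before reaching the stream
        self.stream.write(self.filter(data))


class UpperCaseWriter(SketchWriter):
    def filter(self, data):
        # Subclass hook: transform each unit of data before it is written
        return data.upper()


buffer = io.StringIO()
UpperCaseWriter(buffer).write("chrI\t100\t200\n")
result = buffer.getvalue()
```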

flush()

Flush self.stream

isatty()

Return whether this is an ‘interactive’ stream.

Return False if it can’t be determined.

readable()

Return whether object was opened for reading.

If False, read() will raise OSError.

readline(size=-1, /)

Read and return a line from the stream.

If size is specified, at most size bytes will be read.

The line terminator is always b'\n' for binary files; for text files, the newlines argument to open can be used to select the line terminator(s) recognized.

readlines(hint=-1, /)

Return a list of lines from the stream.

hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.

seek()

Change stream position.

Change the stream position to the given byte offset. The offset is interpreted relative to the position indicated by whence. Values for whence are:

  • 0 – start of stream (the default); offset should be zero or positive

  • 1 – current stream position; offset may be negative

  • 2 – end of stream; offset is usually negative

Return the new absolute position.
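The three whence values can be tried on any seekable stream. The sketch below uses an in-memory io.BytesIO from the standard library rather than a plastid writer, purely for illustration.

```python
import io

stream = io.BytesIO(b"0123456789")

pos_start = stream.seek(3, io.SEEK_SET)    # whence=0: 3 bytes from the start
pos_current = stream.seek(2, io.SEEK_CUR)  # whence=1: 2 bytes past the current position
pos_end = stream.seek(-4, io.SEEK_END)     # whence=2: 4 bytes before the end

# Reading from the final position returns the last four bytes
tail = stream.read()
```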

seekable()

Return whether object supports random access.

If False, seek(), tell() and truncate() will raise OSError. This method may need to do a test seek().

tell(/)

Return current stream position.

truncate()

Truncate file to size bytes.

File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.
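A small sketch of these rules, again using a standard-library io.BytesIO as a stand-in stream:

```python
import io

stream = io.BytesIO(b"0123456789")
stream.seek(6)

# With no argument, the size defaults to the current position reported by tell()
size_default = stream.truncate()
after_default = stream.getvalue()

# An explicit size shortens the stream further but does not move the file pointer
size_explicit = stream.truncate(2)
position_after = stream.tell()
after_explicit = stream.getvalue()
```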

writable()

Return whether object was opened for writing.

If False, write() will raise OSError.

write(data)

Write data to self.stream

Parameters
data : unit of data

Whatever data to filter or format. Often a string, but not necessarily.

writelines(lines, /)

Write a list of lines to stream.

Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.
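The effect of the missing separators is easy to see with an in-memory text stream (io.StringIO is used here only as a convenient stand-in):

```python
import io

records = ["chrI\t100", "chrI\t200"]

# Without explicit separators, the records run together on one line
no_separators = io.StringIO()
no_separators.writelines(records)

# Appending "\n" to each record gives one record per line
with_separators = io.StringIO()
with_separators.writelines(line + "\n" for line in records)

joined = no_separators.getvalue()
separated = with_separators.getvalue()
```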

closed
plastid.util.io.openers.args_to_comment(namespace)

Formats an argparse.Namespace into a comment block useful for printing in headers of output files.

Parameters
namespace : argparse.Namespace

Namespace object returned by argparse.ArgumentParser

Returns
string
plastid.util.io.openers.argsopener(filename, namespace, mode='w', **kwargs)

Open a file for writing, and write to it command-line arguments formatted as a pretty-printed dictionary in comment metadata.

Parameters
filename : str

Name of file to open. If it ends in ‘.gz’ or ‘.bz2’, the filehandle will write to a gzipped or bzipped file.

namespace : argparse.Namespace

Namespace object from argparse.ArgumentParser

mode : str

Mode of writing (‘w’ or ‘wb’)

**kwargs

Other keyword arguments to pass to file opener

Returns
open filehandle
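The general idea of writing parsed arguments as a commented header can be sketched with the standard library alone. This is not plastid's implementation, and the exact header layout that argsopener() produces may differ. The helper open_with_args_header and the --min_length option are hypothetical.

```python
import argparse
import os
import pprint
import tempfile


def open_with_args_header(filename, namespace, mode="w"):
    """Open filename and write the parsed arguments as '#'-prefixed lines (sketch)."""
    handle = open(filename, mode)
    # vars() turns the Namespace into a plain dict; pprint.pformat lays it out readably
    for line in pprint.pformat(vars(namespace)).splitlines():
        handle.write("# " + line + "\n")
    return handle


parser = argparse.ArgumentParser()
parser.add_argument("--min_length", type=int, default=25)  # hypothetical option
args = parser.parse_args(["--min_length", "30"])

path = os.path.join(tempfile.mkdtemp(), "counts.txt")
out = open_with_args_header(path, args)
out.write("region\tcounts\n")  # normal output follows the metadata header
out.close()

with open(path) as fh:
    lines = fh.read().splitlines()
```

Because the metadata lines start with a comment character, a reader that skips comments sees only the table that follows.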
plastid.util.io.openers.get_short_name(inpt, separator='/', terminator='')

Gives the basename of a filename or module name passed as a string. If the string doesn’t match the pattern specified by the separator and terminator, it is returned unchanged.

Parameters
inpt : str

Input

separator : str

Separator between path or module name components (default: “/”)

terminator : str

File terminator (default: “”)

Returns
str

Examples

>>> get_short_name("test")
'test'
>>> get_short_name("test.py", terminator=".py")
'test'
>>> get_short_name("/home/jdoe/test.py", terminator=".py")
'test'
>>> get_short_name("/home/jdoe/test.py.py", terminator=".py")
'test.py'
>>> get_short_name("/home/jdoe/test.py.2")
'test.py.2'
>>> get_short_name("/home/jdoe/test.py.2", terminator=".py")
'test.py.2'
>>> get_short_name("plastid.bin.test", separator=r"\.", terminator="")
'test'
plastid.util.io.openers.multiopen(inp, fn=None, args=None, kwargs=None)

Normalize a filename, a file-like object, or a list of either into a sequence of appropriate objects.

If not list-like, inp is converted to a list. Then, for each element x in inp, if x is file-like, it is yielded. Otherwise, fn is applied to x, and the result yielded.

Parameters
inp : str, file-like, or list-like of either of those

Input describing file(s) to open

fn : callable, optional

Callable to apply to input to open it

args : tuple, optional

Tuple of positional arguments to pass to fn

kwargs : keyword arguments

Arguments to pass to fn

Yields
Object

Result of applying fn to filename(s) in inp
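The normalization described above can be sketched as a small generator. This is an illustration under assumptions, not plastid's code: it treats anything with a read attribute as file-like and falls back to the built-in open when fn is None, and neither choice is confirmed by this page. The name multiopen_sketch is hypothetical.

```python
import io


def multiopen_sketch(inp, fn=None, args=None, kwargs=None):
    """Yield file-like inputs unchanged and apply fn to everything else (sketch)."""
    fn = open if fn is None else fn          # assumed default opener
    args = () if args is None else args
    kwargs = {} if kwargs is None else kwargs

    # A lone string or file-like object is treated as a one-element list
    if isinstance(inp, str) or hasattr(inp, "read"):
        inp = [inp]

    for item in inp:
        if hasattr(item, "read"):
            yield item                        # already open: pass through
        else:
            yield fn(item, *args, **kwargs)   # otherwise open it with fn


already_open = io.StringIO("data")
# str.upper stands in for a real opener so the example touches no files
mixed = list(multiopen_sketch(["a.txt", already_open], fn=str.upper))
single = list(multiopen_sketch("b.txt", fn=str.upper))
```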

plastid.util.io.openers.opener(filename, mode='r', **kwargs)

Open a file, detecting whether it is compressed or not, based upon its file extension. Extensions are tested in the following order:

File ends with    Presumed to be
gz                gzipped
bz2               bzipped
zip               zipped
anything else     uncompressed

Parameters
filename : str

Name of file to open

mode : str

Mode in which to open file. See the Python standard library documentation on file opening modes for choices (e.g. “r”, “a”, “w”, with or without “b”)

**kwargs

Other parameters to pass to appropriate file opener
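Extension-based dispatch of this kind can be sketched with the standard library's gzip and bz2 modules. This is a simplified illustration rather than plastid's implementation: it omits the zip case, and the helper name open_by_extension is hypothetical. Note that gzip.open and bz2.open treat a bare “r” or “w” as binary, so the sketch uses “rt” and “wt” for text.

```python
import bz2
import gzip
import os
import tempfile


def open_by_extension(filename, mode="r", **kwargs):
    """Pick an opener from the file extension (hypothetical helper, zip omitted)."""
    if filename.endswith(".gz"):
        return gzip.open(filename, mode, **kwargs)
    if filename.endswith(".bz2"):
        return bz2.open(filename, mode, **kwargs)
    # Anything else is presumed to be uncompressed
    return open(filename, mode, **kwargs)


path = os.path.join(tempfile.mkdtemp(), "reads.txt.gz")

with open_by_extension(path, "wt") as fh:
    fh.write("hello\n")

with open_by_extension(path, "rt") as fh:
    text = fh.read()

# The file on disk really is gzip-compressed: it starts with the gzip magic bytes
with open(path, "rb") as fh:
    magic = fh.read(2)
```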

plastid.util.io.openers.pretty_print_dict(dtmp)

Pretty-prints an un-nested dictionary.

Parameters
dtmp : dict

Returns
str

pretty-printed dictionary

plastid.util.io.openers.read_pl_table(filename, **kwargs)

Open a table saved by one of plastid’s command-line scripts, passing default arguments to pandas.read_table():

Key          Value
sep          "\t"
comment      "#"
index_col    None
header       0

Parameters
filename : str

Name of file. Can be gzipped, bzipped, or zipped.

**kwargs : keyword arguments

Other keyword arguments to pass to pandas.read_table(). Will override defaults.

Returns
pandas.DataFrame

Table of results
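To show what these defaults mean for the file layout, here is a standard-library analogue that needs neither pandas nor plastid. It is only a sketch: it skips whole lines that begin with the comment character, whereas pandas also handles comments that start mid-line, and it returns plain dictionaries with string values rather than a DataFrame. The helper name read_commented_table and the sample data are hypothetical.

```python
import csv
import io


def read_commented_table(handle, sep="\t", comment="#"):
    """Read a delimited table, ignoring metadata lines (hypothetical helper)."""
    # Drop metadata lines that begin with the comment character
    data_lines = (line for line in handle if not line.startswith(comment))
    # DictReader takes the first remaining row as the header (compare header=0)
    return list(csv.DictReader(data_lines, delimiter=sep))


text = "# {'min_length': 30}\nregion\tcounts\ngeneA\t12\ngeneB\t7\n"
rows = read_commented_table(io.StringIO(text))
```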

plastid.util.io.openers.write_pl_table(df, filename, sep='\t', header=True, index=None, **kwargs)

Wrapper function to write DataFrame df to a tab-delimited table, with a header.

Parameters
df : DataFrame

DataFrame to save

filename : str

Name of file to create

**kwargs : keyword arguments, optional

Any keyword argument accepted by pandas.DataFrame.to_csv().