plastid.util.scriptlib.argparsers module

This module contains classes that:

  • build argparse.ArgumentParser objects for various data types used in genomics

  • parse those arguments into useful file types

Arguments are grouped into the following sets:

Parameter/argument set                                    Parser building class
Generic parameters (e.g. for error reporting, logging)    BaseParser
Read alignments or count files                            AlignmentParser
Genomic feature or mask annotations                       AnnotationParser
Genomic sequence files                                    SequenceParser
Plotting parameters for charts                            PlottingParser

Example

To use any of these in your own command line scripts, follow these steps:

  1. Import one or more of the classes above:

    >>> import argparse
    >>> from plastid.util.scriptlib.argparsers import AnnotationParser
    
  2. Call the class's get_parser() method to create an ArgumentParser, and supply this object as a parent when you build your script’s ArgumentParser:

    >>> ap = AnnotationParser()
    
    # create annotation file parser
    >>> annotation_file_parser = ap.get_parser(disabled=["some_option_to_disable"])
    
    # create my own parser, incorporating flags from annotation_file_parser
    >>> my_own_parser = argparse.ArgumentParser(parents=[annotation_file_parser])
    
    # add script-specific arguments
    >>> my_own_parser.add_argument("positional_argument",type=str)
    >>> my_own_parser.add_argument("--foo",type=int,default=5,help="Some option")
    >>> my_own_parser.add_argument("--bar",type=str,default="a string",help="Another option")
    
  3. Then parse the arguments, and use one of the class's get_*_from_args() methods to convert them into objects:

    >>> args = my_own_parser.parse_args()
    
    # get transcript objects from arguments
    # this will be an iterator over |Transcripts|
    >>> transcripts = ap.get_transcripts_from_args(args)
    
    >>> pass # rest of your script
    

Your script will then be able to process any annotation file format that plastid currently supports.
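The parent-parser mechanism that the steps above rely on is plain argparse. The following minimal sketch uses only the standard library: shared_parser and its --annotation_files flag are hypothetical stand-ins for the parser returned by get_parser(), not plastid's actual options.

```python
import argparse

# Hypothetical stand-in for a parser produced by one of the factory classes.
# Parent parsers are built with add_help=False so that -h/--help is not
# defined twice once the child parser is created.
shared_parser = argparse.ArgumentParser(add_help=False)
shared_parser.add_argument("--annotation_files", nargs="+", default=[],
                           help="One or more annotation files (illustrative flag)")

# The script's own parser inherits every argument defined by its parents
my_own_parser = argparse.ArgumentParser(parents=[shared_parser])
my_own_parser.add_argument("positional_argument", type=str)
my_own_parser.add_argument("--foo", type=int, default=5, help="Some option")

# Parse an explicit argument list instead of sys.argv, for demonstration
args = my_own_parser.parse_args(
    ["outfile.txt", "--annotation_files", "a.bed", "b.bed", "--foo", "7"]
)
```

Because the shared options live in their own parser, several scripts can reuse them and still add their own script-specific arguments.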

See Also

argparse

Python documentation on argument parsing

plastid.bin

Source code of command-line scripts, for further examples

class plastid.util.scriptlib.argparsers.AlignmentParser(prefix='', disabled=None, input_choices=('BAM', 'bigwig', 'bowtie', 'wiggle'), groupname='alignment_options', allow_mapping=True)[source]

Bases: plastid.util.scriptlib.argparsers.Parser

Parser for files containing read alignments or quantitative data.

Checks for additional mapping rules and command-line arguments by checking the entry points plastid.mapping_rules and plastid.mapping_options.

Parameters
groupname : str, optional

Name of argument group. If not None, an argument group with the specified name will be created and added to the parser. If not, arguments will be in the main group.

prefix : str, optional

string prefix to add to default argument options (Default: “”)

disabled : list, optional

list of parameter names that should be disabled from parser, without preceding dashes

input_choices : list, optional

list of permitted alignment file type choices for input

allow_mapping : bool, optional

Enable/disable user configuration of mapping rules (default: True)

Methods

get_genome_array_from_args(args[, printer])

Return a GenomeArray, SparseGenomeArray or BAMGenomeArray from arguments parsed by get_alignment_file_parser()

get_parser([title, description])

Return an ArgumentParser that opens alignment (BAM or bowtie) or count (Wiggle, bedGraph) files.

get_genome_array_from_args(args, printer=None)[source]

Return a GenomeArray, SparseGenomeArray or BAMGenomeArray from arguments parsed by get_alignment_file_parser()

Parameters
args : argparse.Namespace

Arguments from the parser

printer : file-like, optional

A stream to which stderr-like info can be written (default: NullWriter)

Returns
GenomeArray, SparseGenomeArray, or BAMGenomeArray
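The printer argument only needs to behave like a writable stream. As a rough illustration, a stream that silently discards messages can be sketched as follows. DiscardingWriter and report() are hypothetical names used for this sketch; they are not plastid's NullWriter.

```python
import io


class DiscardingWriter(io.TextIOBase):
    """Hypothetical file-like object that accepts text and throws it away."""

    def writable(self):
        return True

    def write(self, text):
        # Report the number of characters "written", as text streams do
        return len(text)


def report(message, printer=None):
    """Send `message` to `printer`, or discard it if no printer is given."""
    if printer is None:
        printer = DiscardingWriter()
    return printer.write(message + "\n")


silent = report("opening alignment file")             # discarded
captured = io.StringIO()
report("opening alignment file", printer=captured)    # kept in memory
```

Passing sys.stderr as the printer would instead show the same messages on the terminal.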
get_parser(title='count & alignment file options', description='Open alignment or count files and optionally set mapping rules', **kwargs)[source]

Return an ArgumentParser that opens alignment (BAM or bowtie) or count (Wiggle, bedGraph) files.

In the case of bowtie or BAM import, also parse arguments for mapping rules (e.g. fiveprime end mapping, threeprime end mapping, etc.) and optional read length filters.

Parameters
groupname : str, optional

Name of argument group. If not None, an argument group with the specified name will be created and added to the parser. If not, arguments will be in the main group.

title : str, optional

title for option group (used in command-line help screen)

description : str, optional

description of parser (used in command-line help screen)

kwargs : keyword arguments

Additional arguments to pass to Parser.get_parser()

Returns
argparse.ArgumentParser
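The mapping rules and read length filters that get_parser() exposes decide which genomic position each read alignment contributes to. The sketch below illustrates the idea only. It assumes unspliced alignments with 0-indexed, half-open coordinates, and the function names and the offset parameter are hypothetical; none of this is taken from plastid's implementation.

```python
def fiveprime_position(start, end, strand, offset=0):
    """Map a read alignment to one position counted from its 5' end.

    Assumes 0-indexed, half-open coordinates [start, end).
    """
    if strand == "+":
        return start + offset
    # On the minus strand the 5' end is the rightmost aligned base
    return end - 1 - offset


def threeprime_position(start, end, strand, offset=0):
    """Map a read alignment to one position counted from its 3' end."""
    if strand == "+":
        return end - 1 - offset
    return start + offset


def passes_length_filter(start, end, min_length=None, max_length=None):
    """Optional read length filter; a bound of None is not enforced."""
    length = end - start
    if min_length is not None and length < min_length:
        return False
    if max_length is not None and length > max_length:
        return False
    return True
```

For a 30-nucleotide read aligned to positions 100 through 129, fiveprime_position() gives 100 on the plus strand and 129 on the minus strand.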
class plastid.util.scriptlib.argparsers.AnnotationParser(prefix='', disabled=None, groupname='annotation_options', input_choices=('BED', 'BigBed', 'GTF2', 'GFF3'))[source]

Bases: plastid.util.scriptlib.argparsers.Parser

Parser for annotation files in various formats

Parameters
groupname : str, optional

Name of argument group. If not None, an argument group with the specified name will be created and added to the parser. If not, arguments will be in the main group.

prefix : str, optional

string prefix to add to default argument options (Default: “”)

disabled : list, optional

list of parameter names that should be disabled from parser, without preceding dashes

input_choices : list, optional

list of permitted annotation file type choices for input

allow_mapping : bool, optional

Enable/disable user configuration of mapping rules (default: True)

Methods

get_genome_hash_from_args(args[, printer])

Return a GenomeHash of regions from command-line arguments

get_parser([title, description])

Return an ArgumentParser that opens annotation files.

get_segmentchains_from_args(args[, printer, ...])

Return a generator of SegmentChain objects from arguments parsed by get_annotation_file_parser()

get_transcripts_from_args(args[, printer, ...])

Return a generator of Transcript objects from arguments parsed by get_annotation_file_parser()

get_genome_hash_from_args(args, printer=None)[source]

Return a GenomeHash of regions from command-line arguments

Parameters
args : argparse.Namespace

Namespace object from get_mask_file_parser()

printer : file-like

A stream to which stderr-like info can be written (Default: NullWriter)

Returns
GenomeHash

Hashed data structure of masked genomic regions

See also

get_mask_file_parser

Function that creates argparse.ArgumentParser whose output Namespace is processed by this function

get_parser(title='annotation file options (one or more annotation files required)', description='Open one or more genome annotation files', **kwargs)[source]

Return an ArgumentParser that opens annotation files.

Parameters
title : str, optional

title for option group (used in command-line help screen)

description : str, optional

description of parser (used in command-line help screen)

kwargs : keyword arguments

Additional arguments to pass to Parser.get_parser()

Returns
argparse.ArgumentParser
get_segmentchains_from_args(args, printer=None, return_type=None, require_sort=False)[source]

Return a generator of SegmentChain objects from arguments parsed by get_annotation_file_parser()

Parameters
args : argparse.Namespace

Namespace object from get_annotation_file_parser()

printer : file-like, optional

A stream to which stderr-like info can be written (Default: NullWriter)

return_type : SegmentChain or subclass, optional

Type of object to return (Default: Transcript)

require_sort : bool, optional

If True, quit if the annotation file(s) are not sorted or indexed

Returns
iterator

SegmentChain objects, either in order of appearance (if input was a BED, BigBed, or PSL file), or sorted lexically by chromosome, start coordinate, end coordinate, and then strand (if input was GTF2 or GFF3).

See also

get_annotation_file_parser

Function that creates argparse.ArgumentParser whose output Namespace is processed by this function
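The sort order described under Returns (chromosome, start coordinate, end coordinate, then strand) corresponds to an ordinary tuple sort key. The sketch below uses a hypothetical Feature record rather than real SegmentChain objects, which carry far more information.

```python
from collections import namedtuple

# Hypothetical minimal record used only for this illustration
Feature = namedtuple("Feature", ["name", "chrom", "start", "end", "strand"])

features = [
    Feature("c", "chrII", 50, 200, "+"),
    Feature("a", "chrI", 300, 400, "-"),
    Feature("b", "chrI", 300, 400, "+"),
    Feature("d", "chrI", 10, 500, "+"),
]


def sort_key(feature):
    """Order by chromosome name (as a string), start, end, then strand."""
    return (feature.chrom, feature.start, feature.end, feature.strand)


sorted_names = [f.name for f in sorted(features, key=sort_key)]
```

Because chromosome names are compared as strings, a name such as "chr10" sorts before "chr2" under this key.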

get_transcripts_from_args(args, printer=None, return_type=None, require_sort=False)[source]

Return a generator of Transcript objects from arguments parsed by get_annotation_file_parser()

Parameters
args : argparse.Namespace

Namespace object from get_annotation_file_parser()

printer : file-like, optional

A stream to which stderr-like info can be written (Default: NullWriter)

return_type : SegmentChain or subclass, optional

Type of object to return (Default: Transcript)

require_sort : bool, optional

If True, quit if the annotation file(s) are not sorted or indexed

Returns
iterator

Transcript objects, either in order of appearance (if input was a BED, BigBed, or PSL file), or sorted lexically by chromosome, start coordinate, end coordinate, and then strand (if input was GTF2 or GFF3).

See also

get_annotation_file_parser

Function that creates argparse.ArgumentParser whose output Namespace is processed by this function
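A require_sort-style check can be applied while streaming, without loading the whole annotation into memory. The generator below is a hypothetical sketch: iter_checking_sort is not a plastid function, and raising ValueError is an assumption of this sketch, where a command-line script might instead print a message and exit.

```python
def iter_checking_sort(features, key=lambda feature: feature):
    """Yield `features` unchanged, failing at the first one that is out of order."""
    previous = None
    for feature in features:
        current = key(feature)
        if previous is not None and current < previous:
            raise ValueError(
                "input is not sorted: %r follows %r" % (current, previous)
            )
        previous = current
        yield feature


# (chromosome, start) pairs standing in for annotation records
sorted_input = [("chrI", 10), ("chrI", 300), ("chrII", 50)]
checked = list(iter_checking_sort(sorted_input))
```

The check costs one comparison per record and stops as soon as a record is out of order.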

class plastid.util.scriptlib.argparsers.BaseParser(groupname='base_options', prefix='', disabled=None)[source]

Bases: plastid.util.scriptlib.argparsers.Parser

Parser for basic options

Parameters
groupname : str, optional

Name of argument group. If not None, an argument group with the specified name will be created and added to the parser. If not, arguments will be in the main group.

prefix : str, optional

string prefix to add to default argument options (Default: “”)

disabled : list, optional

list of parameter names that should be disabled from parser, without preceding dashes

Methods

get_parser([title, description])

Return an ArgumentParser

get_base_ops_from_args

get_base_ops_from_args(args)[source]
get_parser(title=None, description=None)[source]

Return an ArgumentParser

Parameters
title : str, optional

title for option group (used in command-line help screen)

description : str, optional

description of parser (used in command-line help screen)

Returns
argparse.ArgumentParser
class plastid.util.scriptlib.argparsers.MaskParser(prefix='mask_', disabled=None, groupname='mask_options', input_choices=('BED', 'BigBed', 'GTF2', 'GFF3', 'PSL'))[source]

Bases: plastid.util.scriptlib.argparsers.AnnotationParser

Create a parser for masking genomic features given in an annotation file

Parameters
groupname : str, optional

Name of argument group. If not None, an argument group with the specified name will be created and added to the parser. If not, arguments will be in the main group.

prefix : str, optional

string prefix to add to default argument options (Default: “mask_”)

disabled : list, optional

list of parameter names that should be disabled from parser, without preceding dashes

input_choices : list, optional

list of permitted annotation file type choices for input

allow_mapping : bool, optional

Enable/disable user configuration of mapping rules (default: True)

Methods

get_genome_hash_from_args(args[, printer])

Return a GenomeHash of regions from command-line arguments

get_parser([title, description])

Return an ArgumentParser that opens annotation files as masks.

get_segmentchains_from_args(args[, printer, ...])

Return a generator of SegmentChain objects from arguments parsed by get_annotation_file_parser()

get_transcripts_from_args(args[, printer, ...])

Return a generator of Transcript objects from arguments parsed by get_annotation_file_parser()

get_genome_hash_from_args(args, printer=None)

Return a GenomeHash of regions from command-line arguments

Parameters
args : argparse.Namespace

Namespace object from get_mask_file_parser()

printer : file-like

A stream to which stderr-like info can be written (Default: NullWriter)

Returns
GenomeHash

Hashed data structure of masked genomic regions

See also

get_mask_file_parser

Function that creates argparse.ArgumentParser whose output Namespace is processed by this function

get_parser(title='mask file options (optional)', description='Add mask file(s) that annotate regions that should be excluded from analyses\n(e.g. repetitive genomic regions).', **kwargs)[source]

Return an ArgumentParser that opens annotation files as masks.

Parameters
title : str, optional

title for option group (used in command-line help screen)

description : str, optional

description of parser (used in command-line help screen)

arglist : list, optional

If not None, arguments in this list will be added to parser. Otherwise, arguments will be taken from self.arguments.

The list should be a list of tuples of (‘argument_name’,dict_of_options), where argument_name is a string, and dict_of_options a dictionary of keyword arguments to pass to argparse.ArgumentParser.add_argument().

Returns
argparse.ArgumentParser
get_segmentchains_from_args(args, printer=None, return_type=None, require_sort=False)

Return a generator of SegmentChain objects from arguments parsed by get_annotation_file_parser()

Parameters
args : argparse.Namespace

Namespace object from get_annotation_file_parser()

printer : file-like, optional

A stream to which stderr-like info can be written (Default: NullWriter)

return_type : SegmentChain or subclass, optional

Type of object to return (Default: Transcript)

require_sort : bool, optional

If True, quit if the annotation file(s) are not sorted or indexed

Returns
iterator

SegmentChain objects, either in order of appearance (if input was a BED, BigBed, or PSL file), or sorted lexically by chromosome, start coordinate, end coordinate, and then strand (if input was GTF2 or GFF3).

See also

get_annotation_file_parser

Function that creates argparse.ArgumentParser whose output Namespace is processed by this function

get_transcripts_from_args(args, printer=None, return_type=None, require_sort=False)

Return a generator of Transcript objects from arguments parsed by get_annotation_file_parser()

Parameters
args : argparse.Namespace

Namespace object from get_annotation_file_parser()

printer : file-like, optional

A stream to which stderr-like info can be written (Default: NullWriter)

return_type : SegmentChain or subclass, optional

Type of object to return (Default: Transcript)

require_sort : bool, optional

If True, quit if the annotation file(s) are not sorted or indexed

Returns
iterator

Transcript objects, either in order of appearance (if input was a BED, BigBed, or PSL file), or sorted lexically by chromosome, start coordinate, end coordinate, and then strand (if input was GTF2 or GFF3).

See also

get_annotation_file_parser

Function that creates argparse.ArgumentParser whose output Namespace is processed by this function

class plastid.util.scriptlib.argparsers.Parser(groupname=None, prefix='', disabled=None, **kwargs)[source]

Bases: object

Base class for argument parser factories used below

Parameters
groupname : str, optional

Name of argument group. If not None, an argument group with the specified name will be created and added to the parser. If not, arguments will be in the main group.

prefix : str, optional

string prefix to add to default argument options (Default: “”)

disabled : list, optional

list of parameter names that should be disabled from parser, without preceding dashes

Methods

get_parser([parser, groupname, arglist, ...])

Create and populate an argparse.ArgumentParser with arguments

get_parser(parser=None, groupname=None, arglist=None, title=None, description=None, **kwargs)[source]

Create and populate an argparse.ArgumentParser with arguments

Parameters
parser : argparse.ArgumentParser or None, optional

If None, a new parser will be created, and arguments will be added to it. If not None, arguments will be added to parser. (Default: None)

groupname : str or None, optional

If None, default to self.groupname. If either groupname or self.groupname is not None, an option group with this name will be added to parser, and arguments will be added to that group instead of the main argument group of parser. In this case, title and description will be applied to the option group instead of to parser. (Default: None)

arglist : list, optional

If not None, arguments in this list will be added to parser. Otherwise, arguments will be taken from self.arguments.

The list should be a list of tuples of (‘argument_name’,dict_of_options), where argument_name is a string, and dict_of_options a dictionary of keyword arguments to pass to argparse.ArgumentParser.add_argument().

title : str, optional

Optional title for parser

description : str, optional

Optional description for parser

kwargs : keyword arguments

Additional arguments passed during creation of argparse.ArgumentParser

Returns
argparse.ArgumentParser
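To make the arglist, prefix, disabled and groupname parameters of Parser.get_parser() concrete, here is a standard-library sketch of a factory that consumes (argument_name, dict_of_options) tuples. build_parser and the example options are hypothetical; the sketch illustrates the pattern described above and is not plastid's code.

```python
import argparse


def build_parser(arglist, prefix="", disabled=None, groupname=None,
                 title=None, description=None):
    """Hypothetical factory: turn (name, options) tuples into a parent parser."""
    disabled = disabled or []
    parser = argparse.ArgumentParser(add_help=False)
    # Arguments go into a named option group if requested, else the main group
    target = parser
    if groupname is not None:
        target = parser.add_argument_group(title=title or groupname,
                                           description=description)
    for name, options in arglist:
        if name in disabled:
            continue  # skip arguments the caller switched off
        target.add_argument("--%s%s" % (prefix, name), **options)
    return parser


arglist = [
    ("min_length", dict(type=int, default=25, help="Minimum read length")),
    ("max_length", dict(type=int, default=100, help="Maximum read length")),
    ("verbose", dict(action="store_true", help="Print extra messages")),
]

parser = build_parser(arglist, prefix="mask_", disabled=["verbose"],
                      groupname="demo_options")
args = parser.parse_args(["--mask_min_length", "30"])
```

Note that disabled names are given without the prefix and without leading dashes, matching the convention described for the disabled parameter.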
class plastid.util.scriptlib.argparsers.PlottingParser(groupname='plotting_options', prefix='', disabled=None)[source]

Bases: plastid.util.scriptlib.argparsers.Parser

Parser for plotting options

Parameters
groupname : str, optional

Name of argument group. If not None, an argument group with the specified name will be created and added to the parser. If not, arguments will be in the main group.

prefix : str, optional

string prefix to add to default argument options (Default: “”)

disabled : list, optional

list of parameter names that should be disabled from parser, without preceding dashes

Methods

get_colors_from_args(args, num_colors)

Return a list of colors from arguments parsed by a parser from get_plotting_parser()

get_figure_from_args(args, **kwargs)

Return a matplotlib.figure.Figure following arguments from get_plotting_parser()

get_parser([title, description])

Return an ArgumentParser to control plotting

set_style_from_args(args)

Parse style information, if present on system and defined in args

get_colors_from_args(args, num_colors)[source]

Return a list of colors from arguments parsed by a parser from get_plotting_parser()

If a matplotlib colormap is specified in args.figcolors, colors will be generated from that map.

Otherwise, if a stylesheet is specified, colors will be fetched from the stylesheet’s color cycle.

Otherwise, colors will be chosen from the default color cycle specified in matplotlibrc.

Parameters
args : argparse.Namespace

Namespace object from get_plotting_parser()

num_colors : int

Number of colors to fetch

Returns
list

List of matplotlib colors
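The three-level color fallback described for get_colors_from_args() can be sketched without matplotlib. In this hypothetical version, a colormap is any callable mapping a number in [0, 1] to a color, and sampling it at evenly spaced points is an assumption of the sketch rather than documented plastid behavior.

```python
from itertools import cycle, islice


def choose_colors(num_colors, colormap=None, stylesheet_cycle=None,
                  default_cycle=("blue", "orange", "green")):
    """Pick `num_colors` colors: colormap first, then stylesheet, then default."""
    if colormap is not None:
        # Sample the colormap at evenly spaced points in [0, 1]
        if num_colors == 1:
            return [colormap(0.0)]
        return [colormap(i / (num_colors - 1)) for i in range(num_colors)]
    # Otherwise repeat the stylesheet's cycle, or the default cycle, as needed
    source = stylesheet_cycle if stylesheet_cycle else default_cycle
    return list(islice(cycle(source), num_colors))


def gray(x):
    """Toy colormap: map x in [0, 1] to an RGB gray tuple."""
    return (x, x, x)
```

A color cycle repeats when more colors are requested than it contains, whereas a colormap can supply any number of distinct colors.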

get_figure_from_args(args, **kwargs)[source]

Return a matplotlib.figure.Figure following arguments from get_plotting_parser()

A new figure is created with parameters specified in args. If these are not found, values found in **kwargs will instead be used. If these are not found, we fall back to matplotlibrc values.

Parameters
args : argparse.Namespace

Namespace object from get_plotting_parser()

kwargs : keyword arguments

Fallback arguments for items not defined in args, plus any other keyword arguments.

Returns
matplotlib.figure.Figure

Matplotlib figure
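The lookup order used by get_figure_from_args() (values in args, then **kwargs, then matplotlibrc) can be illustrated with plain dictionaries. In this sketch RC_DEFAULTS is a hypothetical stand-in for matplotlibrc, and treating None as "not supplied on the command line" is an assumption.

```python
import argparse

# Hypothetical defaults standing in for values read from matplotlibrc
RC_DEFAULTS = {"figsize": (6.4, 4.8), "dpi": 100}


def resolve_figure_options(args, **kwargs):
    """Resolve each option from args, then kwargs, then RC_DEFAULTS."""
    resolved = {}
    for name, rc_value in RC_DEFAULTS.items():
        from_args = getattr(args, name, None)
        if from_args is not None:
            resolved[name] = from_args      # given on the command line
        elif name in kwargs:
            resolved[name] = kwargs[name]   # fallback supplied by the script
        else:
            resolved[name] = rc_value       # last resort
    return resolved


args = argparse.Namespace(figsize=None, dpi=300)
options = resolve_figure_options(args, figsize=(8, 6))
```

Here dpi comes from the command line and figsize from the script's fallback, so neither default is used.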

get_parser(title='Plotting options', description=None)[source]

Return an ArgumentParser to control plotting

Parameters
title : str, optional

title for option group (used in command-line help screen)

description : str, optional

description of parser (used in command-line help screen)

Returns
argparse.ArgumentParser
set_style_from_args(args)[source]

Parse style information, if present on system and defined in args

Parameters
args : argparse.Namespace

Namespace object from get_plotting_parser()

class plastid.util.scriptlib.argparsers.PrefixNamespaceWrapper(namespace, prefix)[source]

Bases: object

Wrapper class to facilitate processing of Namespace objects created by get_alignment_file_parser() or get_annotation_file_parser() with non-empty prefix values, as if no prefix had been used.

Attributes
namespace : Namespace

Result of calling argparse.ArgumentParser.parse_args()

prefix : str

Prefix that will be prepended to names of attributes of self.namespace before they are fetched. Must match prefix that was used in creation of the argparse.ArgumentParser that created self.namespace
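The behavior described for PrefixNamespaceWrapper can be approximated with a small attribute-forwarding class. The sketch below is hypothetical (PrefixedView and the --mask_annotation_files flag are illustrative names) and only shows the idea of prepending a prefix on lookup.

```python
import argparse


class PrefixedView(object):
    """Hypothetical wrapper: look up `prefix + name` on a wrapped Namespace."""

    def __init__(self, namespace, prefix):
        self.namespace = namespace
        self.prefix = prefix

    def __getattr__(self, name):
        # __getattr__ is only called when normal lookup fails, so
        # `namespace` and `prefix` themselves are found without recursion
        return getattr(self.namespace, self.prefix + name)


parser = argparse.ArgumentParser()
parser.add_argument("--mask_annotation_files", nargs="+", default=[])
namespace = parser.parse_args(["--mask_annotation_files", "repeats.bed"])

view = PrefixedView(namespace, "mask_")
files = view.annotation_files   # reads namespace.mask_annotation_files
```

Code written for unprefixed attribute names can then be reused unchanged on a namespace whose options were created with a prefix.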

class plastid.util.scriptlib.argparsers.SequenceParser(groupname='sequence_options', prefix='', disabled=None, input_choices=('fasta', 'fastq', 'twobit', 'genbank', 'embl'))[source]

Bases: plastid.util.scriptlib.argparsers.AnnotationParser

Parser for sequence files

Parameters
groupname : str, optional

Name of argument group. If not None, an argument group with the specified name will be created and added to the parser. If not, arguments will be in the main group.

prefix : str, optional

string prefix to add to default argument options (Default: “”)

disabled : list, optional

list of parameter names that should be disabled from parser, without preceding dashes

input_choices : list, optional

list of permitted sequence file type choices for input

Methods

get_genome_hash_from_args(args[, printer])

Return a GenomeHash of regions from command-line arguments

get_parser([title, description])

Return an ArgumentParser that opens sequence files

get_segmentchains_from_args(args[, printer, ...])

Return a generator of SegmentChain objects from arguments parsed by get_annotation_file_parser()

get_seqdict_from_args(args[, index, printer])

Retrieve a dictionary-like object of sequences

get_transcripts_from_args(args[, printer, ...])

Return a generator of Transcript objects from arguments parsed by get_annotation_file_parser()

get_genome_hash_from_args(args, printer=None)

Return a GenomeHash of regions from command-line arguments

Parameters
args : argparse.Namespace

Namespace object from get_mask_file_parser()

printer : file-like

A stream to which stderr-like info can be written (Default: NullWriter)

Returns
GenomeHash

Hashed data structure of masked genomic regions

See also

get_mask_file_parser

Function that creates argparse.ArgumentParser whose output Namespace is processed by this function

get_parser(title='sequence options', description='', **kwargs)[source]

Return an ArgumentParser that opens sequence files

Parameters
title : str, optional

title for option group (used in command-line help screen)

description : str, optional

description of parser (used in command-line help screen)

kwargs : keyword arguments

Additional arguments to pass to Parser.get_parser()

Returns
argparse.ArgumentParser

See also

get_seqdict_from_args

function that parses the Namespace returned by this ArgumentParser

get_segmentchains_from_args(args, printer=None, return_type=None, require_sort=False)

Return a generator of SegmentChain objects from arguments parsed by get_annotation_file_parser()

Parameters
args : argparse.Namespace

Namespace object from get_annotation_file_parser()

printer : file-like, optional

A stream to which stderr-like info can be written (Default: NullWriter)

return_type : SegmentChain or subclass, optional

Type of object to return (Default: Transcript)

require_sort : bool, optional

If True, quit if the annotation file(s) are not sorted or indexed

Returns
iterator

SegmentChain objects, either in order of appearance (if input was a BED, BigBed, or PSL file), or sorted lexically by chromosome, start coordinate, end coordinate, and then strand (if input was GTF2 or GFF3).

See also

get_annotation_file_parser

Function that creates argparse.ArgumentParser whose output Namespace is processed by this function

get_seqdict_from_args(args, index=True, printer=None)[source]

Retrieve a dictionary-like object of sequences

Parameters
args : argparse.Namespace

Namespace object from get_sequence_file_parser()

index : bool, optional

If sequence format is anything other than twobit, open with lazily-evaluating Bio.SeqIO.index() instead of Bio.SeqIO.to_dict() (Default: True)

printer : file-like

A stream to which stderr-like info can be written (Default: NullWriter)

Returns
dict-like

Dictionary-like object mapping chromosome names to Bio.SeqRecord.SeqRecord-like objects
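The index option trades memory for access time: an eager dictionary reads every sequence up front, while a lazy index records only where each record starts and reads it on request. The pure-Python sketch below illustrates that difference for FASTA input. It is not how Bio.SeqIO.index() or Bio.SeqIO.to_dict() are implemented, and the names eager_fasta_dict and LazyFastaIndex are hypothetical.

```python
import os
import tempfile


def eager_fasta_dict(path):
    """Read every sequence into memory at once."""
    records, name = {}, None
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:].split()[0]
                records[name] = []
            elif name is not None:
                records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


class LazyFastaIndex(object):
    """Remember where each record starts; read a sequence only on request."""

    def __init__(self, path):
        self.path = path
        self.offsets = {}
        with open(path, "rb") as fh:
            while True:
                pos = fh.tell()
                line = fh.readline()
                if not line:
                    break
                if line.startswith(b">"):
                    self.offsets[line[1:].split()[0].decode()] = pos

    def __getitem__(self, name):
        chunks = []
        with open(self.path, "rb") as fh:
            fh.seek(self.offsets[name])
            fh.readline()  # skip the header line
            for line in fh:
                if line.startswith(b">"):
                    break
                chunks.append(line.strip().decode())
        return "".join(chunks)


# Demonstrate both approaches on a small temporary FASTA file
with tempfile.NamedTemporaryFile("w", suffix=".fa", delete=False) as tmp:
    tmp.write(">chrI demo\nACGT\nAC\n>chrII\nGGCC\n")
index = LazyFastaIndex(tmp.name)
eager = eager_fasta_dict(tmp.name)
chr1 = index["chrI"]
os.remove(tmp.name)
```

For genome-sized files the lazy approach keeps only one offset per record in memory, at the cost of reopening the file on each lookup.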

get_transcripts_from_args(args, printer=None, return_type=None, require_sort=False)

Return a generator of Transcript objects from arguments parsed by get_annotation_file_parser()

Parameters
args : argparse.Namespace

Namespace object from get_annotation_file_parser()

printer : file-like, optional

A stream to which stderr-like info can be written (Default: NullWriter)

return_type : SegmentChain or subclass, optional

Type of object to return (Default: Transcript)

require_sort : bool, optional

If True, quit if the annotation file(s) are not sorted or indexed

Returns
iterator

Transcript objects, either in order of appearance (if input was a BED, BigBed, or PSL file), or sorted lexically by chromosome, start coordinate, end coordinate, and then strand (if input was GTF2 or GFF3).

See also

get_annotation_file_parser

Function that creates argparse.ArgumentParser whose output Namespace is processed by this function

plastid.util.scriptlib.argparsers.get_alignment_file_parser(input_choices=('BAM', 'bigwig', 'bowtie', 'wiggle'), disabled=None, prefix='', title='count & alignment file options', description='Open alignment or count files and optionally set mapping rules', map_desc='For BAM or bowtie files, one of the mutually exclusive read mapping functions\nis required:\n', return_subparsers=False)[source]
plastid.util.scriptlib.argparsers.get_annotation_file_parser(input_choices=['BED', 'BigBed', 'GTF2', 'GFF3'], disabled=[], prefix='', title='annotation file options (one or more annotation files required)', description='Open one or more genome annotation files', return_subparsers=False)[source]

Return an ArgumentParser that opens annotation files from BED, BigBed, GTF2, or GFF3 formats

Parameters
input_choices : list, optional

list of permitted annotation file type choices (Default: [“BED”, “BigBed”, “GTF2”, “GFF3”]). “PSL” may also be added.

disabled : list, optional

list of parameter names that should be disabled from parser without preceding dashes

prefix : str, optional

string prefix to add to default argument options (Default: ‘’)

title : str, optional

title for option group (used in command-line help screen)

description : str, optional

description of parser (used in command-line help screen)

return_subparsers : bool, optional

if True, additionally return a dictionary of subparser option groups, to which additional options may be added (Default: False)

Returns
argparse.ArgumentParser

See also

get_transcripts_from_args

function that parses the Namespace returned by this ArgumentParser

plastid.util.scriptlib.argparsers.get_colors_from_args(args, num_colors)[source]

Return a list of colors from arguments parsed by a parser from get_plotting_parser()

If a matplotlib colormap is specified in args.figcolors, colors will be generated from that map.

Otherwise, if a stylesheet is specified, colors will be fetched from the stylesheet’s color cycle.

Otherwise, colors will be chosen from the default color cycle specified in matplotlibrc.

Parameters
args : argparse.Namespace

Namespace object from get_plotting_parser()

num_colors : int

Number of colors to fetch

Returns
list

List of matplotlib colors

plastid.util.scriptlib.argparsers.get_figure_from_args(args, **kwargs)[source]

Return a matplotlib.figure.Figure following arguments from get_plotting_parser()

A new figure is created with parameters specified in args. If these are not found, values found in **kwargs will instead be used. If these are not found, we fall back to matplotlibrc values.

Parameters
args : argparse.Namespace

Namespace object from get_plotting_parser()

kwargs : keyword arguments

Fallback arguments for items not defined in args, plus any other keyword arguments.

Returns
matplotlib.figure.Figure

Matplotlib figure

plastid.util.scriptlib.argparsers.get_genome_array_from_args(args, prefix='', disabled=None, printer=None)[source]

Return a GenomeArray, SparseGenomeArray or BAMGenomeArray from arguments parsed by get_alignment_file_parser()

Parameters
args : argparse.Namespace

Namespace object from get_alignment_file_parser()

prefix : str, optional

string prefix to add to default argument options. Must be the same prefix that was added in the call to get_alignment_file_parser() (Default: “”)

disabled : list, optional

list of parameter names that were disabled when the argparser was created in get_alignment_file_parser(). (Default: [])

printer : file-like, optional

A stream to which stderr-like info can be written (default: NullWriter)

Returns
GenomeArray, SparseGenomeArray, or BAMGenomeArray

See also

get_alignment_file_parser

Function that creates ArgumentParser whose output Namespace is processed by this function

plastid.util.scriptlib.argparsers.get_genome_hash_from_mask_args(args, prefix='mask_', printer=NullWriter())[source]

Return a GenomeHash of regions from command-line arguments

Parameters
args : argparse.Namespace

Namespace object from get_mask_file_parser()

prefix : str, optional

string prefix to add to default argument options. Must be same prefix that was added in call to get_mask_file_parser() (Default: “mask_”)

printer : file-like

A stream to which stderr-like info can be written (Default: NullWriter)

Returns
GenomeHash

Hashed data structure of masked genomic regions

See also

get_mask_file_parser

Function that creates argparse.ArgumentParser whose output Namespace is processed by this function

plastid.util.scriptlib.argparsers.get_mask_file_parser(prefix='mask_', disabled=[])[source]

Create an ArgumentParser to open annotation files that describe regions of the genome to mask from analyses

Parameters
prefix : str, optional

Prefix to add to default argument options (Default: ‘mask_’)

disabled : list, optional

list of parameter names to disable from the mask file parser (Default: []. add_three is always disabled.)

Returns
argparse.ArgumentParser

See also

get_genome_hash_from_mask_args

function that parses the Namespace returned by this ArgumentParser

plastid.util.scriptlib.argparsers.get_plotting_parser(prefix='', disabled=[], title='Plotting options')[source]

Return an ArgumentParser to control plotting

Parameters
disabled : list, optional

list of parameter names that should be disabled from parser without preceding dashes

prefix : str, optional

string prefix to add to default argument options (Default: ‘’)

title : str, optional

title for option group (used in command-line help screen)

description : str, optional

description of parser (used in command-line help screen)

Returns
argparse.ArgumentParser

See also

get_colors_from_args

parse colors and/or colormaps from this argument parser

plastid.util.scriptlib.argparsers.get_segmentchain_file_parser(input_choices=['BED', 'BigBed', 'GTF2', 'GFF3', 'PSL'], disabled=[], prefix='', title='annotation file options (one or more annotation files required)', description='Open one or more genome annotation files')[source]

Create an ArgumentParser to open annotation files as SegmentChains

Parameters
input_choices : list, optional

list of permitted annotation file type choices (Default: [“BED”, “BigBed”, “GTF2”, “GFF3”, “PSL”])

disabled : list, optional

list of parameter names that should be disabled from parser without preceding dashes

prefix : str, optional

string prefix to add to default argument options (Default: ‘’)

title : str, optional

title for option group (used in command-line help screen)

description : str, optional

description of parser (used in command-line help screen)

Returns
argparse.ArgumentParser

See also

get_segmentchains_from_args

function that parses the Namespace returned by this ArgumentParser

plastid.util.scriptlib.argparsers.get_segmentchains_from_args(args, prefix='', disabled=[], printer=NullWriter(), require_sort=False)[source]

Return a list of SegmentChain objects from arguments parsed by an ArgumentParser created by get_segmentchain_file_parser()

Parameters
args : argparse.Namespace

Namespace object from get_segmentchain_file_parser()

prefix : str, optional

string prefix to add to default argument options. Must be same prefix that was added in call to get_segmentchain_file_parser() (Default: “”)

disabled : list, optional

list of parameter names that were disabled when the annotation file parser was created by get_segmentchain_file_parser(). (Default: [])

printer : file-like

A stream to which stderr-like info can be written (Default: NullWriter)

require_sort : bool, optional

If True, quit if the annotation file(s) are not sorted or indexed

Returns
iterator

sequence of SegmentChain objects, either in order of appearance (if input was a BED or PSL file), or sorted lexically by chromosome, start coordinate, end coordinate, and then strand (if input was GTF or GFF)

See also

get_segmentchain_file_parser

Function that creates argparse.ArgumentParser whose output Namespace is processed by this function

plastid.util.scriptlib.argparsers.get_seqdict_from_args(args, index=True, prefix='', printer=NullWriter())[source]

Retrieve a dictionary-like object of sequences

Parameters
args : argparse.Namespace

Namespace object from get_sequence_file_parser()

prefix : str, optional

string prefix to add to default argument options. Must be same prefix that was added in call to get_sequence_file_parser() (Default: “”)

index : bool, optional

If sequence format is anything other than twobit, open with lazily-evaluating Bio.SeqIO.index() instead of Bio.SeqIO.to_dict() (Default: True)

printer : file-like

A stream to which stderr-like info can be written (Default: NullWriter)

Returns
dict-like

Dictionary-like object mapping chromosome names to Bio.SeqRecord.SeqRecord-like objects

plastid.util.scriptlib.argparsers.get_sequence_file_parser(input_choices=('fasta', 'fastq', 'twobit', 'genbank', 'embl'), disabled=(), prefix='', title='sequence options', description='')[source]

Return an ArgumentParser that opens sequence files

Parameters
input_choices : list, optional

list of permitted sequence file type choices (Default: (“fasta”, “fastq”, “twobit”, “genbank”, “embl”)).

disabled : list, optional

list of parameter names that should be disabled from parser without preceding dashes

prefix : str, optional

string prefix to add to default argument options (Default: ‘’)

title : str, optional

title for option group (used in command-line help screen)

description : str, optional

description of parser (used in command-line help screen)

Returns
argparse.ArgumentParser

See also

get_seqdict_from_args

function that parses the Namespace returned by this ArgumentParser

plastid.util.scriptlib.argparsers.get_transcripts_from_args(args, prefix='', disabled=[], printer=NullWriter(), return_type=None, require_sort=False)[source]

Return a list of Transcript objects from arguments parsed by get_annotation_file_parser()

Parameters
args : argparse.Namespace

Namespace object from get_annotation_file_parser()

prefix : str, optional

string prefix to add to default argument options. Must be same prefix that was added in call to get_annotation_file_parser() (Default: ‘’)

disabled : list, optional

list of parameter names that were disabled when the annotation file parser was created by get_annotation_file_parser(). (Default: [])

printer : file-like, optional

A stream to which stderr-like info can be written (Default: NullWriter)

return_type : SegmentChain or subclass, optional

Type of object to return (Default: Transcript)

require_sort : bool, optional

If True, quit if the annotation file(s) are not sorted or indexed

Returns
iterator

Transcript objects, either in order of appearance (if input was a BED, BigBed, or PSL file), or sorted lexically by chromosome, start coordinate, end coordinate, and then strand (if input was GTF2 or GFF3).

See also

get_annotation_file_parser

Function that creates argparse.ArgumentParser whose output Namespace is processed by this function