Extending plastid¶
Plastid defines the following entry points to enable users to write plug-in functions that can be accessed from the command line:
Entry point                Used for
plastid.mapping_rules      Adding new mapping rules to plastid’s command-line scripts.
plastid.mapping_options    Adding command-line options used by new mapping rules.
For plastid to discover your plug-in, your plug-in must be registered with the system. Registration requires some packaging, which isn’t too painful. Packaging is discussed in the following sections:
Setting up the folder¶
First, create a folder with structure similar to the following:
my_project/
    setup.py
    my_project/
        __init__.py
        map_rules.py
Adjust the filenames to suit your project.
Writing command-line options for mapping rules¶
Mapping rules¶
We assume you have written mapping rules as described in Writing new mapping functions. plastid needs some metadata to use them. This is specified in a dictionary that defines at least one of bamfunc or bowtiefunc. All of the remaining keys are optional:
Key          Value type   Value
name         str          Overrides the command-line name of the mapping rule defined in
                          setup.py. I.e., the flag --name will be the command-line
                          argument that invokes the rule. Must not contain spaces,
                          dashes, or special characters. Underscores are okay.
bamfunc      Function     Mapping function for alignments in BAM format
bowtiefunc   Function     Mapping function for alignments in bowtie format
help         str          Command-line help for the mapping function. Should describe
                          what the function does, and which command-line arguments
                          affect its behavior (e.g. --offset, --nibble, or something
                          added in Additional parameters for mapping rules)
If bowtiefunc or bamfunc are unspecified or set to None, plastid will assume the mapping function is not implemented for the corresponding type. Typically, users would only write a function for mapping BAM files.
We’ll suppose that all of our functions are specified in my_project/map_rules.py as described above. The contents of map_rules.py might then look something like this:
#!/usr/bin/env python

def rule1_for_bowtie_files(alignment, args=None):
    # calculate position(s) where a single alignment maps
    # and the value to place at each position
    #
    # the parsed command-line arguments will be passed
    # as an argparse.Namespace object
    ...
    return position_value_tuples

def rule1_for_BAM_files(alignments, segment, args=None):
    # calculate positions where a list of alignments map,
    # and a vector of values at each position
    #
    # again, args is an argparse.Namespace object
    # from the command-line args
    ...
    return reads_out, count_array

def rule2_for_BAM_files_only(alignments, segment, args=None):
    # calculate positions where a list of alignments map,
    # and a vector of values at each position
    ...
    # do something with a command-line argument
    my_option = args.new_option
    if my_option is not None:
        # placeholder: adjust the mapping using my_option
        pass
    return reads_out, count_array

rule1_info = {
    "name"       : "rule1",
    "bamfunc"    : rule1_for_BAM_files,
    "bowtiefunc" : rule1_for_bowtie_files,
    "help"       : "Some help text for rule 1."
}

rule2_info = {
    "name"    : "rule2",
    "bamfunc" : rule2_for_BAM_files_only,
    "help"    : "Some help text. Rule 2's behavior is modified by the option `--new_option`"
}
rule1 is defined for both BAM and bowtie files. rule2 is defined only for BAM files, and it uses the command-line option --new_option, which we define below in Additional parameters for mapping rules.
Additional parameters for mapping rules¶
Additional command-line parameters are also specified as dictionaries. In these, the keys and values can be any valid parameters for argparse.ArgumentParser.add_argument(). Each dictionary should additionally define a key called name, whose value will be used as the name of the command-line argument. For example, we might add the following lines to my_project/map_rules.py:
param1 = {
    "name"    : "new_option",
    "type"    : int,
    "nargs"   : 2,
    "help"    : "Some help text for --new_option",
    "metavar" : "N",
}
That’s it!
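How plastid consumes these dictionaries internally is not shown here. As a rough illustration only, and assuming the name key is removed and the remaining keys are forwarded to add_argument(), the behavior of param1 can be previewed with plain argparse. The helper add_option_from_dict is hypothetical.

```python
import argparse

param1 = {
    "name": "new_option",
    "type": int,
    "nargs": 2,
    "help": "Some help text for --new_option",
    "metavar": "N",
}


def add_option_from_dict(parser, option):
    """Register `option` on `parser` (hypothetical helper, not plastid's code)."""
    kwargs = dict(option)      # copy so the original dictionary is left intact
    name = kwargs.pop("name")  # 'name' becomes the flag
    # every remaining key is passed through as an add_argument() keyword
    parser.add_argument("--" + name, **kwargs)


parser = argparse.ArgumentParser()
add_option_from_dict(parser, param1)

# Simulate a command line that supplies both integers
args = parser.parse_args(["--new_option", "3", "5"])
print(args.new_option)
```

Because nargs is 2 and type is int, args.new_option arrives as a list of two integers, which is the form a mapping function would see in its args parameter.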
Writing setup.py¶
Having written the mapping functions and made dictionaries describing them, we need to write package metadata so that plastid can find the new functions. All of this information goes into setup.py.
setup.py should contain everything needed to set up and install your package. For more information see the documentation for setuptools and/or distutils. setup.py should minimally contain the following:
#!/usr/bin/env python
from setuptools import setup, find_packages

# list all the rules we want to include
# syntax is:
#
#     "rule_name = path.to.module:rule_info_dictionary"
#
rules = [
    "rule1 = my_project.map_rules:rule1_info",
    "rule2 = my_project.map_rules:rule2_info",
]

# list any extra arguments we want to include
# syntax is:
#
#     "argument_name = path.to.module:arg_info_dictionary"
#
rule_options = [
    "new_option = my_project.map_rules:param1",
]

setup(
    # root level name of package
    name = "my_project",

    # tell setup() that `rules` and `rule_options` specify mapping
    # rules and arguments for plastid:
    entry_points = {
        "plastid.mapping_rules"   : rules,
        "plastid.mapping_options" : rule_options,
    },

    setup_requires = ['plastid>=0.4.4'],
    packages = find_packages(),

    # plus any other arguments (e.g. package author, description)
    # to ``setup``.
)
That’s the last piece.
Installing the new mapping rules¶
Installation is the final step. Enter the folder containing setup.py. Then, to install your new mapping rules, type:
$ python setup.py install [--user]
Or, if you plan to keep developing your mapping rules, and want plastid to be aware of these changes instantly:
$ python setup.py develop --user
To test your installation, check command-line help from a script that uses mapping rules (e.g. make_wiggle):
$ make_wiggle --help
If the installation proceeded correctly you should see something like this:
# rest of command line help above

alignment mapping options (BAM & bowtie files only):
  For BAM or bowtie files, one of the mutually exclusive read mapping choices
  is required:

  --fiveprime_variable  Map read alignment to a variable offset from 5'
                        position of read, with offset determined by read
                        length. Requires `--offset` below
  --fiveprime           Map read alignment to 5' position.
  --threeprime          Map read alignment to 3' position
  --center              Subtract N positions from each end of read, and add
                        1/(length-N), to each remaining position, where N is
                        specified by `--nibble`
  --rule2               Some help text. Rule 2's behavior is modified by the
                        option `--new_option`
  --rule1               Some help text for rule 1.

  The remaining arguments are optional and affect the behavior of specific
  mapping rules:

  --offset OFFSET       For `--fiveprime` or `--threeprime`, provide an
                        integer representing the offset into the read,
                        starting from either the 5' or 3' end, at which data
                        should be plotted. For `--fiveprime_variable`, provide
                        the filename of a two-column tab-delimited text file,
                        in which first column represents read length or the
                        special keyword `'default'`, and the second column
                        represents the offset from the five prime end of that
                        read length at which the read should be mapped.
  --nibble N            For use with `--center` only. nt to remove from each
                        end of read before mapping (Default: 0)
  --new_option N N      Some help text for --new_option

# remaining command-line help below
If the new mapping rule and command-line arguments are listed, you are ready.
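If the new flags do not appear, it can be useful to confirm that the entry points were actually registered when the package was installed. The sketch below uses only the standard library's importlib.metadata (Python 3.8 or later) to list whatever is registered under the two plastid groups. It inspects installed package metadata only and makes no assumption about how plastid itself performs discovery; list_entry_points is a hypothetical helper name.

```python
from importlib import metadata


def list_entry_points(group):
    """Return sorted (name, target) pairs registered under `group`."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object with a select() method
        selected = eps.select(group=group)
    else:
        # Python 3.8 and 3.9 return a plain dict keyed by group name
        selected = eps.get(group, [])
    return sorted((ep.name, ep.value) for ep in selected)


for group in ("plastid.mapping_rules", "plastid.mapping_options"):
    print(group)
    for name, target in list_entry_points(group):
        print("    %s = %s" % (name, target))
```

After a successful install, entries such as rule1, rule2, and new_option should be listed under their groups. An empty group suggests that the entry_points argument in setup.py was not picked up or that the package was installed into a different environment.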
See also¶
Read mapping functions for information on how to write mapping rules
argparse documentation for information on command-line arguments
Documentation for setuptools and distutils for more information on packaging