plastid
0.6.0
Welcome!
plastid is a Python library for genomics and sequencing. It includes tools for exploratory data analysis (EDA) as well as a handful of scripts that implement common tasks.
plastid differs from other packages in its design goals. Namely:
Its intended audience includes both bench and computational biologists. We tried to make it easy to use, and wrote lots of Tutorials.
It is designed for analyses in which data at each position within a gene or transcript are of interest, such as analysis of ribosome profiling data. To this end, plastid uses Read mapping functions to extract the biology of interest from read alignments (e.g. a ribosomal P-site in the case of ribosome profiling, or sites of nucleotide modification in DMS-seq) and turn these into quantitative data, usually numpy arrays of counts at each nucleotide position in a transcript. It also encapsulates multi-segment features, such as spliced transcripts, as single objects. This facilitates many common tasks, such as converting coordinates between genome-centric and feature-centric spaces.
It separates data from its representation on disk by providing consistent interfaces to many of the various file formats found in the wild.
It is designed for expansion to new or unknown assays. Frequently, writing a new mapping rule is sufficient to enable all of plastid’s tools to interpret data coming from a new assay.
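The design goals above can be illustrated with a small standalone sketch. It does not use plastid's API: the names `EXONS`, `genome_to_transcript`, `fiveprime_offset_rule`, and `count_reads`, the 12-nt offset, and the toy reads are all assumptions made up for illustration. It shows a mapping rule that assigns each read to a single genomic position, a conversion from genomic to transcript coordinates across a spliced feature, and the resulting numpy array of counts at each transcript position.

```python
import numpy as np

# Hypothetical plus-strand transcript with two exons, given as half-open
# (start, end) genomic intervals. Not taken from any real annotation.
EXONS = [(100, 150), (200, 240)]


def genome_to_transcript(pos, exons):
    """Convert a genomic position to a transcript position.

    Returns None if the position falls in an intron or outside the transcript.
    """
    offset = 0  # number of transcript nucleotides in exons already passed
    for start, end in exons:
        if start <= pos < end:
            return offset + (pos - start)
        offset += end - start
    return None


def fiveprime_offset_rule(read_start, read_length, offset=12):
    """Toy mapping rule: map a read to the position `offset` nt from its 5' end.

    Assumes unspliced plus-strand reads; the default offset is illustrative.
    """
    if offset >= read_length:
        return None  # read too short for this rule
    return read_start + offset


def count_reads(reads, exons, rule):
    """Apply a mapping rule to each read and tally counts per transcript position."""
    length = sum(end - start for start, end in exons)
    counts = np.zeros(length, dtype=int)
    for read_start, read_length in reads:
        site = rule(read_start, read_length)
        if site is None:
            continue
        tpos = genome_to_transcript(site, exons)
        if tpos is not None:  # skip sites that land outside the transcript
            counts[tpos] += 1
    return counts


# Toy alignments as (genomic 5' start, read length) pairs.
READS = [(100, 28), (100, 30), (120, 29), (205, 28), (160, 28)]
counts = count_reads(READS, EXONS, fiveprime_offset_rule)
```

In this sketch, supporting a different assay would only require passing a different rule function to `count_reads`; the coordinate conversion and counting code stay unchanged.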
plastid was written by Joshua Dunn in Jonathan Weissman’s lab at UCSF. Versions of it have been used in several publications ([DFB+13, FRJ+15]).
Where to go next
Those new to sequencing and/or bioinformatics, and those working with ribosome profiling data, should start with Getting started, and then continue to the Tour and Tutorials. The description of command-line scripts may also be helpful.
Advanced users might be more interested in a quick Tour of the primary data structures and the module documentation.