plastid 0.6.0

Welcome!

plastid is a Python library for genomics and sequencing. It includes tools for exploratory data analysis (EDA) as well as a handful of scripts that implement common tasks.

plastid differs from other packages in its design goals. Namely:

Its intended audience includes both bench and computational biologists. We tried to make it easy to use, and wrote lots of Tutorials.

It is designed for analyses in which data at each position within a gene or transcript are of interest, such as analysis of ribosome profiling data. To this end, plastid:

uses Read mapping functions to extract the biology of interest from read alignments (e.g. in the case of ribosome profiling, a ribosomal P-site; in DMS-seq, sites of nucleotide modification; etc.) and turn these into quantitative data, usually numpy arrays of counts at each nucleotide position in a transcript.

encapsulates multi-segment features, such as spliced transcripts, as single objects. This facilitates many common tasks, such as converting coordinates between genome-centric and feature-centric spaces.

It separates data from its representation on disk by providing consistent interfaces to many of the various file formats found in the wild.

It is designed for expansion to new or unknown assays. Frequently, writing a new mapping rule is sufficient to enable all of plastid’s tools to interpret data coming from a new assay.
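
The ideas above can be illustrated with a small, self-contained sketch. It is not plastid's API: every name below (`genome_to_transcript`, `five_prime_rule`, `count_vector`) is hypothetical, and it uses plain Python lists where plastid would usually return numpy arrays. It also simplifies by assuming plus-strand features, half-open genomic intervals, and unspliced reads. The sketch shows a mapping rule that reduces each read to a single position, a conversion from genomic to transcript coordinates across several exons, and the resulting per-nucleotide count vector.

```python
from functools import partial
from typing import Callable, List, Optional, Tuple

# Hypothetical types for illustration only; these are not plastid classes.
Interval = Tuple[int, int]                    # half-open [start, end), plus strand
MappingRule = Callable[[int, int], List[int]]


def genome_to_transcript(exons: List[Interval], pos: int) -> Optional[int]:
    """Convert a genomic position to a transcript-relative index.

    Returns None if the position falls in an intron or outside the feature.
    """
    offset = 0  # number of transcript nucleotides in exons already passed
    for start, end in exons:
        if start <= pos < end:
            return offset + (pos - start)
        offset += end - start
    return None


def five_prime_rule(read_start: int, read_end: int, offset: int = 0) -> List[int]:
    """Map a read to one genomic position, `offset` nt from its 5' end."""
    pos = read_start + offset
    return [pos] if pos < read_end else []


def count_vector(exons: List[Interval], reads: List[Interval],
                 rule: MappingRule) -> List[int]:
    """Count mapped positions at each nucleotide of a multi-exon feature."""
    counts = [0] * sum(end - start for start, end in exons)
    for read_start, read_end in reads:
        for genomic_pos in rule(read_start, read_end):
            transcript_pos = genome_to_transcript(exons, genomic_pos)
            if transcript_pos is not None:
                counts[transcript_pos] += 1
    return counts


# A two-exon "transcript" and a handful of made-up read alignments.
exons = [(100, 105), (200, 204)]
reads = [(100, 130), (102, 132), (102, 130), (150, 180), (201, 230)]

counts = count_vector(exons, reads, five_prime_rule)
print(counts)  # [1, 0, 2, 0, 0, 0, 1, 0, 0]

# Supporting a different assay only requires a different rule.
shifted_counts = count_vector(exons, reads, partial(five_prime_rule, offset=2))
print(shifted_counts)  # [0, 0, 1, 0, 2, 0, 0, 0, 1]
```

Only the rule changes between the two calls; the coordinate conversion and counting code is reused unchanged, which is the design point made in the list above.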

plastid was written by Joshua Dunn in Jonathan Weissman’s lab at UCSF. Versions of it have been used in several publications ([DFB+13, FRJ+15]).

Where to go next

Those new to sequencing and/or bioinformatics, and those performing ribosome profiling, should start with Getting started, and then continue to the Tour and Tutorials. The description of command-line scripts may also be helpful.

Advanced users might be more interested in a quick Tour of the primary data structures and the module documentation.

Site map

Index

Module Index