Algorithms

Identification of Structured Signatures and Classifiers (ISSAC) (Sung et al., PLoS Comp. Bio., 2013)
Identifying molecular signatures in blood, saliva, or urine that accurately reflect the major pathologies of a specific organ system would be a significant advance in molecular cancer diagnostics. ISSAC is a machine-learning algorithm that stratifies multiple clinical phenotypes simultaneously based on the relative expression of biological features (e.g., genes or proteins). It takes a data-driven, hierarchical approach: it first constructs a tree-structured hierarchy of disease phenotypes by agglomerative clustering, and then learns, from training data, binary classifiers corresponding to the nodes and edges of that hierarchy. These classifiers are based on comparing ranked expression values within sets of gene pairs. The genes appearing in the hierarchy of decision rules can then be accumulated into a biomarker panel, which directs disease stratification down the classification tree toward a particular phenotype. Note: the code is written in MATLAB.

DOCUMENTATION
DOWNLOAD (example dataset and scripts)
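
To illustrate the kind of relative-expression decision rule described above, here is a minimal Python sketch of a single gene-pair classifier for two phenotypes. It is an illustrative simplification and not the ISSAC implementation (which is in MATLAB and uses sets of gene pairs arranged in a hierarchy): the scoring rule, the function names (`pair_score`, `fit_pair_rule`, `predict`), and the toy expression values are all assumptions made for this example.

```python
from itertools import combinations


def pair_score(samples_a, samples_b, i, j):
    """Between-class difference in how often feature i is expressed below feature j."""
    freq_a = sum(s[i] < s[j] for s in samples_a) / len(samples_a)
    freq_b = sum(s[i] < s[j] for s in samples_b) / len(samples_b)
    return freq_a - freq_b


def fit_pair_rule(samples_a, samples_b):
    """Pick the feature pair whose relative ordering best separates the two classes."""
    n_features = len(samples_a[0])
    i, j = max(
        combinations(range(n_features), 2),
        key=lambda pair: abs(pair_score(samples_a, samples_b, *pair)),
    )
    # Record which ordering is typical of class A.
    a_when_less = pair_score(samples_a, samples_b, i, j) > 0
    return i, j, a_when_less


def predict(rule, sample):
    """Classify one sample using only the ordering of the two selected features."""
    i, j, a_when_less = rule
    return "A" if (sample[i] < sample[j]) == a_when_less else "B"


# Hypothetical toy data: rows are samples, columns are three features.
samples_a = [[1.0, 5.0, 3.0], [2.0, 6.0, 7.0], [0.5, 4.0, 0.2]]
samples_b = [[5.0, 1.0, 3.0], [6.0, 2.0, 7.0], [4.5, 0.5, 0.2]]

rule = fit_pair_rule(samples_a, samples_b)
label_1 = predict(rule, [1.5, 4.0, 9.0])
label_2 = predict(rule, [7.0, 3.0, 0.1])
```

Because the rule depends only on which of two features is higher within a sample, it is unaffected by any monotone rescaling of that sample's values. In a hierarchical scheme such as the one described above, a rule of this kind would sit at each branch of the phenotype tree.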

Metabolic Influence between ordered pairs of microbial entities (Sung et al., Nat. Comm., 2017)
In complex microbial ecosystems, a microbial entity can provide nutrients to another entity through interspecies cross-feeding of metabolic byproducts and/or the release of macromolecule degradation products. This positive influence may promote the recipient’s growth. Conversely, a microbial entity can limit another entity’s access to nutrients by competing for the same metabolites. This negative influence may inhibit growth. Accordingly, we use our microbial metabolite transport network (NJS16) to formulate and quantify the net metabolic influence of a given microbial entity on another. This approach lets us construct a community-scale network of positive and negative metabolic influences between pairs of microbial entities that are differentially abundant or scarce in a given context, e.g., gut microbiomes of T2D patients vs. non-diabetic controls. In the accompanying figure (left), we show the microbial metabolic influence network composed of microbial entities associated with a particular patient cohort. It features the interplay of positive and negative metabolic influences among 125 microbial entities. Note: the code is written in C++.

DOCUMENTATION
DOWNLOAD (example dataset and scripts for Linux users)
DOWNLOAD (example dataset and scripts for Windows users)
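
The idea of a signed, directed influence between an ordered pair of entities can be sketched in a few lines of Python. The score below is a hypothetical stand-in chosen for illustration, not the formula from the paper or the released C++ code: it takes the fraction of the target's imported metabolites that the source supplies, minus the fraction that the source also imports. The entity records and metabolite sets are made-up examples.

```python
def net_influence(source, target):
    """Illustrative net influence of `source` on `target`, in the range [-1, 1].

    Assumed (hypothetical) score: share of the target's imports that the source
    provides, minus the share of the target's imports that the source competes for.
    """
    needs = target["imports"]
    if not needs:
        return 0.0
    # Metabolites the source makes available, by export or macromolecule degradation.
    provided = needs & (source["exports"] | source["degradation_products"])
    # Metabolites both entities import, i.e. potential competition.
    contested = needs & source["imports"]
    return (len(provided) - len(contested)) / len(needs)


# Hypothetical entities with made-up transport profiles.
entity_a = {
    "imports": {"glucose"},
    "exports": {"lactate", "acetate"},
    "degradation_products": set(),
}
entity_b = {
    "imports": {"lactate", "acetate", "glucose"},
    "exports": {"butyrate"},
    "degradation_products": set(),
}

a_on_b = net_influence(entity_a, entity_b)  # two metabolites supplied, one contested
b_on_a = net_influence(entity_b, entity_a)  # nothing supplied, one contested
```

The two directions generally differ, which is why the influence is defined on ordered pairs: here `entity_a` has a net positive influence on `entity_b`, while `entity_b` has a net negative influence on `entity_a`.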


Datasets

Microbial Metabolite Transport Network (NJS16) (Sung et al., Nat. Comm., 2017)
To provide a global framework for understanding community metabolism within the human gut, we present NJS16, the first literature-curated, community-level network of the human gut microbiota organized through metabolite transport. The network compiles 4,483 annotated transport or degradation reactions (from about 400 research articles, reviews, and textbooks) between 244 metabolic compounds (229 small molecules and 15 macromolecules) and 570 microbial species and human cell types (511 bacteria, 56 archaea, and 3 host cells). Specifically, our network shows how individual microbes interact with their chemical environment (via metabolite import, export, and macromolecule degradation), and thereby with other microbes (via resource competition, interspecies cross-feeding, and the release of macromolecule degradation products as public goods).

DOCUMENTATION
DOWNLOAD (NJS16 in .xlsx, .txt, & .xml formats)
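
As a rough illustration of how a transport network of this kind yields microbe-microbe interactions, the Python sketch below indexes a handful of transport records by metabolite and derives cross-feeding and competition pairs. The `(entity, action, metabolite)` record layout is an assumption made for this example and does not reflect the actual layout of the NJS16 files; the entries are made up, and only import and export actions are modeled.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records in an assumed (entity, action, metabolite) layout.
records = [
    ("entity_a", "export", "acetate"),
    ("entity_b", "import", "acetate"),
    ("entity_c", "import", "acetate"),
    ("entity_c", "export", "butyrate"),
]


def index_by_metabolite(records):
    """Map each metabolite to the sets of entities that import or export it."""
    index = defaultdict(lambda: defaultdict(set))
    for entity, action, metabolite in records:
        index[metabolite][action].add(entity)
    return index


def cross_feeding_pairs(index):
    """(donor, recipient, metabolite) triples: one entity exports what another imports."""
    pairs = set()
    for metabolite, actions in index.items():
        for donor in actions.get("export", set()):
            for recipient in actions.get("import", set()):
                if donor != recipient:
                    pairs.add((donor, recipient, metabolite))
    return pairs


def competition_pairs(index):
    """(entity, entity, metabolite) triples: two entities import the same metabolite."""
    pairs = set()
    for metabolite, actions in index.items():
        for first, second in combinations(sorted(actions.get("import", set())), 2):
            pairs.add((first, second, metabolite))
    return pairs


index = index_by_metabolite(records)
feeding = cross_feeding_pairs(index)
competing = competition_pairs(index)
```

Macromolecule degradation could be handled in the same way by treating the released degradation products as exports of the degrading entity.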