Download Datasets
Here we provide links to download or access the datasets described in our publication:
Peregrin-Alvarez JM, Xiong X, Su C and Parkinson J. (2009) The modular organization of protein interactions in Escherichia coli. PLoS Computational Biology. In Press.
Our datasets are provided for public download without restriction. For datasets generated by third party sources, please follow the links for details on use and restrictions.
|Datasets Available Here|
The functional interaction dataset was generated through the integration of ten interaction datasets within a Bayesian framework and was found to significantly outperform the other functional datasets
presented below (see the above publication for further details). Scores given are log likelihood scores.
3989 interactions between 1941 proteins
The Hu et al. TAP interaction dataset (AKA 'Core - experimental') was derived from an ongoing series of TAP-tag pulldown experiments. Scores given are 'confidence scores'; see the Hu et al. reference presented
below. Alternatively, users may wish to download the latest TAP datasets by following the Hu et al. GC dataset link below.
3888 interactions between 918 proteins
The combined interaction dataset (AKA 'Core - experimental') was derived from integrating the Hu et al. and functional datasets. Scores given are log likelihood scores (7 s.f.) and confidence scores (3
s.f.). Note that the two types of scores are not readily comparable; however, a log likelihood score of ~0.27 and a confidence score of ~0.7 are presumed to be of equivalent quality to interactions derived from small
7613 interactions between 2283 proteins
The extended - experimental dataset includes lower-quality interactions associated with the Hu et al. TAP dataset, presented on this website.
7220 interactions between 2291 proteins
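Each dataset above is summarized as a number of interactions between a number of proteins. As a rough illustration of how such a summary could be checked after download, the sketch below counts unique undirected pairs and distinct proteins. The file layout is an assumption (tab-delimited, one interaction per line, with protein A, protein B and score columns); the actual download format may differ, and the function name `summarize_interactions` is hypothetical.

```python
import csv
import io


def summarize_interactions(handle, delimiter="\t"):
    """Count unique undirected interactions and the distinct proteins involved.

    Assumes (not confirmed by this page) that each row holds two protein
    identifiers in its first two columns, optionally followed by a score.
    """
    pairs = set()
    proteins = set()
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) < 2 or row[0].startswith("#"):
            continue  # skip blank, short or comment lines
        a, b = row[0].strip(), row[1].strip()
        pairs.add(frozenset((a, b)))  # A-B and B-A count as one interaction
        proteins.update((a, b))
    return len(pairs), len(proteins)


# Made-up rows for illustration only; not taken from any dataset on this page.
example = io.StringIO("protA\tprotB\t1.25\nprotB\tprotA\t1.25\nprotB\tprotC\t0.40\n")
n_interactions, n_proteins = summarize_interactions(example)
print(f"{n_interactions} interactions between {n_proteins} proteins")
```

For a downloaded file, the same function could be called on an open file handle in place of the in-memory example.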
|Third Party Datasets|
The Yellaboina interaction dataset was derived from integrating a variety of genome context datasets.
Yellaboina S, Goyal K and Mande SC (2007) Inferring genome-wide functional linkages in E. coli by combining improved genome context methods: Comparison with high-throughput experimental data. Genome Research. 17, 527-535.
The Hu et al. GC interaction dataset was also derived from integrating a variety of genome context datasets. We used the suggested cut-off of 0.8.
Hu P, Janga SC, Babu M, Díaz-Mejía JJ, Butland G, Yang W, Pogoutse O, Guo X, Phanse S, Wong P, Chandran S, Christopoulos C, Nazarians-Armavil A, Nasseri NK, Musso G, Ali M, Nazemof N, Eroukova V, Golshani A, Paccanaro A, Greenblatt JF, Moreno-Hagelsieb G, Emili A. (2009) Global functional atlas of Escherichia coli encompassing previously uncharacterized proteins. PLoS Biol. 7:e96.
The STRING interaction dataset was derived from integrating a variety of genome context, literature and experimental datasets. We used Version 7.1 of the database, extracted interactions associated with
E. coli strain W3110 and applied the suggested cut-off of 0.7.
von Mering C, Jensen LJ, Snel B, Hooper SD, Krupp M, Foglierini M, Jouffre N, Huynen MA and Bork P (2005) STRING: known and predicted protein-protein associations, integrated and transferred across organisms. Nucleic Acids Res. 33: D433-D437.
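The third-party datasets above were filtered with suggested score cut-offs (0.8 for the Hu et al. GC dataset, 0.7 for STRING). A minimal sketch of that kind of filtering follows. It assumes interactions are already loaded as (protein A, protein B, score) tuples with scores on a 0 to 1 scale, and that the cut-off is inclusive; neither detail is stated on this page, and the function name `apply_cutoff` is hypothetical.

```python
def apply_cutoff(interactions, cutoff):
    """Keep (protein_a, protein_b, score) tuples scoring at or above cutoff.

    Treating the cut-off as inclusive (>=) is an assumption.
    """
    return [(a, b, score) for a, b, score in interactions if score >= cutoff]


# Made-up interactions for illustration only; not real dataset entries.
example = [
    ("protA", "protB", 0.95),
    ("protA", "protC", 0.70),
    ("protB", "protC", 0.42),
]

kept = apply_cutoff(example, 0.7)  # the cut-off quoted above for STRING
print(len(kept), "of", len(example), "interactions retained")
```

If a source file stores scores on a different scale, the cut-off would need to be rescaled to match before filtering.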