SeQuence IDentification (SQID) is a database search algorithm for tandem mass spectrometry developed in the Wysocki group. The SQID program and source code are available under the GNU GPL license.
Usage
Open “SQID_1.0.exe”, specify the fasta database and dta folders, then click “Index&Rundtalist”. The “Run dtalist” button is only available when an “.index” file is used instead of a “.fasta” file. Indexing any database creates an “.index” file in the db folder.

Input files
Currently SQID only accepts .dta files. The data field should point to the folders that DIRECTLY contain the .dta files. A newer test version that directly accepts Thermo .raw files (collected with Xcalibur 2.0 or lower) can be downloaded here. Note that, in addition to the .out output option, this version has an .mzid output option in place of the .txt output option.

Database
SQID accepts (.fasta) databases. Indexing a database creates a folder, a protein file (.pro) and an (.index) file, each with the same name as the fasta database, in the “db” folder. The indexed database can be reused in future searches by putting the (.index) file in the “database” field. Because SQID currently requires an indexed database for every search, it does not support very large databases such as NR.

Output files
Currently two output options are available (.out and .txt). Xcorr = SQID score/5; SQID score/5 is on almost the same scale as Sequest Xcorr. Filtering the .out files by Xcorr and DeltCN (Xcorr > 1.8, 2.5, 3.5; DeltCN > 0.05) gives a ~5% FDR for SQID. The .out files can be viewed with Scaffold or DTASelect; they can also be converted to pepXML using the Trans-Proteomic Pipeline (TPP). Note that a “Sequest.params” file may be needed for DTASelect and TPP. This file can be obtained from any Sequest search. More information about DTASelect and a sample Sequest.params file can be downloaded here.

Common errors
1. SQID makes use of some system commands in win32. If you see an error message like “‘cmd’ is not recognized as an internal or external command”, simply go to the “Windows/system32” folder and copy “cmd.exe” to the SQID folder containing “SQID_1.0.exe”.

Work in progress
1. Incorporate more input and output file formats.
2. Enable more protease options.
3. Improve database index efficiency.
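
The score conversion and filter described under “Output files” can be sketched in a few lines of Python. This is only an illustration, not part of SQID: the Hit record, its field names and the function names are hypothetical, and the mapping of the three Xcorr cutoffs (1.8, 2.5, 3.5) to charge states 1+, 2+ and 3+ is an assumption based on common Sequest filtering practice, since the text does not state it. Rejecting other charge states is likewise an assumption.

```python
from dataclasses import dataclass

# ASSUMPTION: the three Xcorr cutoffs apply to charge states 1+, 2+ and 3+.
XCORR_CUTOFFS = {1: 1.8, 2: 2.5, 3: 3.5}
DELTCN_CUTOFF = 0.05


@dataclass
class Hit:
    """Hypothetical record for one peptide-spectrum match."""
    scan: str
    charge: int
    sqid_score: float
    deltcn: float


def xcorr_from_sqid(sqid_score: float) -> float:
    """Convert a SQID score to the Xcorr-like scale (Xcorr = SQID score / 5)."""
    return sqid_score / 5.0


def passes_filter(hit: Hit) -> bool:
    """Apply the Xcorr and DeltCN thresholds to a single hit."""
    cutoff = XCORR_CUTOFFS.get(hit.charge)
    if cutoff is None:
        # ASSUMPTION: charge states without a listed cutoff are rejected.
        return False
    return xcorr_from_sqid(hit.sqid_score) > cutoff and hit.deltcn > DELTCN_CUTOFF


def filter_hits(hits):
    """Return only the hits that pass both thresholds."""
    return [h for h in hits if passes_filter(h)]
```

For example, a doubly charged hit with a SQID score of 13.0 and DeltCN of 0.10 converts to Xcorr 2.6 and is kept, while the same hit with a SQID score of 12.0 (Xcorr 2.4) is dropped.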
SQID-XLink (for cross-linking)
SQID-XLink is a database search algorithm designed specifically for tandem mass spectrometry based cross-linking studies. It automatically searches regular peptides, mono-linked peptides and cross-linked peptides, using a scoring function similar to that of SQID. Currently the BS2g, BS3 and EDC cross-linkers are supported. The program is freely available under the GNU GPL license. © 2011 Wysocki group

Download SQID-XLink (for Windows)
Please read the Usage.html file in the distribution for usage. Please report bugs to lwz@email.arizona.edu

Reference: W. Li, H.A. O’Neill, V.H. Wysocki. SQID-XLink: Implementation of an Intensity-Incorporated Algorithm for Cross-linked Peptide Identification. Bioinformatics, 2012, doi:10.1093/bioinformatics/bts442

Spectrum predictor
Spectrum predictor is a program that predicts ion trap CID fragmentation spectra with intensities. The program is freely available under the GNU GPL license. © 2011 Wysocki group

Download Spectrum predictor (for Windows)
The program is still in the testing stage. Please report bugs to lwz@email.arizona.edu

PNNL dataset (28311 spectra)
The PNNL dataset contains 28311 spectra (25% singly charged, 62% doubly charged and 13% triply charged) from unmodified Deinococcus radiodurans and Shewanella oneidensis peptides, collected by the Pacific Northwest National Laboratory (PNNL) on a Thermo LCQ ion trap mass spectrometer. The dataset was used to optimize and test our algorithm.

References for PNNL dataset:

Hemoglobin data
We recently reported the successful de novo sequencing of hemoglobins from nine small mammals native to North America using LC-MS/MS combined with PepNovo. The spectra files as well as the PepNovo results for each species can be downloaded from the following links:
Microtus pennsylvanicus

Reference: Ünige A. Laskay, Erin J. Kaleta, Inger-Marie E. Vilcins, Sam R. Telford III, Alan G. Barbour, Vicki H. Wysocki. Development of a Host Blood Meal Database: De Novo Sequencing of Hemoglobin from Nine Small Mammals Using Mass Spectrometry, Biological Chemistry, 393, pp. 195–201, 2012.