{"id":98,"date":"2012-09-15T22:31:39","date_gmt":"2012-09-15T22:31:39","guid":{"rendered":"http:\/\/research.chemistry.ohio-state.edu\/wysocki\/?page_id=98"},"modified":"2015-01-14T20:04:28","modified_gmt":"2015-01-15T01:04:28","slug":"bioinformatics","status":"publish","type":"page","link":"https:\/\/research.cbc.osu.edu\/wysocki.11\/group-home\/bioinformatics\/","title":{"rendered":"Bioinformatics"},"content":{"rendered":"<table class=\"borderless\" border=\"1\" width=\"582\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td valign=\"top\">\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-118\" src=\"https:\/\/research.cbc.osu.edu\/wysocki.11\/wp-content\/uploads\/2012\/09\/SQID-title.jpg\" alt=\"\" width=\"200\" height=\"50\" \/><\/p>\n<p><strong><em>SeQuence\u00a0IDentification<\/em><\/strong>\u00a0(SQID) is a database searching algorithm for tandem mass spectrometry developed in the\u00a0Wysocki\u00a0group. The SQID program and\u00a0<span style=\"color: #000080;\"><a href=\"http:\/\/dl.dropbox.com\/u\/45234038\/SQID_source_code_1.0.zip\" target=\"_blank\"><span style=\"color: #000080;\">source code<\/span><\/a><\/span>\u00a0are available under the GNU GPL license.<\/p>\n<p align=\"center\"><span style=\"color: #000080;\"><a href=\"http:\/\/dl.dropbox.com\/u\/45234038\/SQID_1.0.zip\" target=\"_blank\"><span style=\"color: #000080;\">Download SQID (for Windows)<\/span><\/a><\/span><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td><strong>Usage<\/strong><br \/>\nOpen &#8220;SQID_1.0.exe&#8221;,\u00a0specify the\u00a0fasta\u00a0database and\u00a0dta\u00a0folders, then click &#8220;Index&amp;Rundtalist&#8221;. The &#8220;Run\u00a0dtalist&#8221; button is available only when an &#8220;.index&#8221; file is used instead of a &#8220;.fasta&#8221; file. Indexing any database creates an &#8220;.index&#8221; file in the\u00a0db\u00a0folder.<br \/>\n<strong><em>Input files<\/em><\/strong>\u00a0 Currently,\u00a0SQID accepts only .dta\u00a0files. 
Enter the folders that DIRECTLY contain the .dta\u00a0files in the data field. A newer test version, which can accept Thermo .Raw files directly (collected with\u00a0Xcalibur 2.0 or lower), can be\u00a0<a href=\"http:\/\/dl.dropbox.com\/u\/45234038\/SQID_2.0.zip\" target=\"_blank\">downloaded here<\/a>. Note that, in addition to the .out output\u00a0option,\u00a0this version has an .mzid\u00a0output option instead of the .txt output option.<\/p>\n<p><strong><em>Database<\/em><\/strong>\u00a0 SQID\u00a0accepts (.fasta) databases. After the database is indexed, a folder, a protein file (.pro) and an (.index) file with the same name as the\u00a0fasta\u00a0database are created in the &#8220;db&#8221; folder. The indexed database can be reused in future searches by putting the (.index) file in the &#8220;database&#8221; field. Because SQID currently requires an indexed database for all searches, it does not support very large databases such as NR.<\/p>\n<p><strong><em>Output files\u00a0<\/em><\/strong>\u00a0 Currently, two output options are available:<br \/>\n1. A single tab-delimited file (.txt), which can be opened directly in Excel. This format reports only the top hit for each spectrum, with the SQID score, delta score, intensity score, matched ions and ion pairs. In Excel, the results can easily be sorted\/filtered by any column. This format is mainly used for testing purposes.<br \/>\n2. .out files that mimic\u00a0Sequest output. This is currently the default format. The .out files are generated in the .dta\u00a0folders. Note that in the .out file:<\/p>\n<p>Xcorr = SQID score\/5;<br \/>\ndeltCN =\u00a0deltSQID\u00a0((top-second)\/top);<br \/>\nsp = intensity score;<br \/>\nNumber of matched ions = Number of matched ions in SQID;<\/p>\n<p>SQID score\/5 is on nearly the same scale as\u00a0Sequest\u00a0Xcorr. Filtering the .out files with Xcorr\u00a0and\u00a0DeltCN\u00a0(Xcorr&gt;1.8, 2.5, 3.5,\u00a0DeltCN&gt;0.05) will give a ~5% FDR for SQID. 
The .out files can be viewed using Scaffold or\u00a0DTASelect; they can also be converted to\u00a0pepXML\u00a0using the Trans-Proteomic\u00a0Pipeline\u00a0(TPP). Note that a &#8220;Sequest.params&#8221; file\u00a0may be needed for\u00a0DTASelect\u00a0and TPP. This file can be obtained from any\u00a0Sequest\u00a0search. More information about\u00a0DTASelect\u00a0and a sample Sequest.params\u00a0file are available\u00a0<a href=\"http:\/\/www.scripps.edu\/chemphys\/cravatt\/protomap\/dtaselect_instructions.html\" target=\"_blank\">here<\/a>.<\/p>\n<p><strong>Common errors<\/strong><\/p>\n<p>1. SQID makes use of some system commands in win32.<\/p>\n<p>If you see an error message like\u00a0&#8220;&#8216;cmd&#8217;\u00a0is not recognized as an internal or external command&#8221;, simply go to the &#8220;Windows\/system32&#8221; folder and copy &#8220;cmd.exe&#8221; to the SQID folder containing &#8220;SQID_1.0.exe&#8221;.<\/p>\n<p><strong>Work in progress<\/strong><\/p>\n<p>1. Incorporate more input and output file formats.<\/p>\n<p>2. Enable more protease options.<\/p>\n<p>3. Improve database index efficiency.<\/p>\n<p><strong>Reference<\/strong><br \/>\nW. Li, L.\u00a0Ji, J. Goya, G. Tan &amp; V.H.\u00a0Wysocki, &#8220;SQID: An Intensity-Incorporated Protein Identification Algorithm for Tandem Mass Spectrometry,&#8221;<em>\u00a0J. Proteome Res\u00a0<\/em><strong>10<\/strong>(4), 1593-1602 (2011).<br \/>\nPlease report bugs to\u00a0<a href=\"mailto:lwz@email.arizona.edu?subject=SQID%20Inquiry:%20\">lwz@email.arizona.edu<\/a><\/p>\n<p>&nbsp;<\/p>\n<div align=\"center\">\n<hr align=\"center\" size=\"2\" width=\"100%\" \/>\n<\/div>\n<p style=\"text-align: center;\">\u00a0<strong>SQID-XLink\u00a0(for cross-linking)<\/strong><\/p>\n<p>SQID-XLink\u00a0is a database searching algorithm designed specifically for cross-linking studies based on tandem mass spectrometry. It automatically searches regular peptides, mono-linked peptides and cross-linked peptides. 
It uses a scoring function similar to that of SQID. Currently, the BS2g, BS3 and EDC cross-linkers are supported. The program is freely available under the GNU GPL license. \u00a9\u00a02011\u00a0Wysocki\u00a0group<\/p>\n<p align=\"center\"><span style=\"color: #000080;\"><a href=\"https:\/\/dl.dropbox.com\/u\/45234038\/SQID_XLink_1.0.zip\" target=\"_blank\"><span style=\"color: #000080;\">Download SQID-XLink\u00a0(for Windows)<\/span><\/a><\/span><\/p>\n<p>Please read the Usage.html file in the distribution for usage instructions.<\/p>\n<p>Please report bugs to\u00a0<a href=\"mailto:lwz@email.arizona.edu?subject=SQID%20Inquiry:%20\">lwz@email.arizona.edu<\/a><\/p>\n<p>Reference: W. Li, H.A. O\u2019Neill, V.H. Wysocki. SQID-XLink: Implementation of An Intensity-Incorporated Algorithm for Cross-linked Peptide Identification. <em>Bioinformatics<\/em>, 2012, doi:10.1093\/bioinformatics\/bts442<\/p>\n<div align=\"center\">\n<hr align=\"center\" size=\"2\" width=\"100%\" \/>\n<\/div>\n<p align=\"center\"><strong>Spectrum predictor<\/strong><\/p>\n<p>Spectrum predictor is a program that predicts ion trap CID fragmentation spectra with intensities. The program is freely available under the GNU GPL license. 
\u00a9\u00a02011\u00a0Wysocki group<\/p>\n<p align=\"center\"><span style=\"color: #000080;\"><a href=\"http:\/\/dl.dropbox.com\/u\/45234038\/SpectrumPredictor.exe\" target=\"_blank\"><span style=\"color: #000080;\">Download Spectrum predictor (for Windows)<\/span><\/a><\/span><\/p>\n<p>The program is still in the testing stage.<\/p>\n<p>Please report bugs to\u00a0<a href=\"mailto:lwz@email.arizona.edu?subject=SQID%20Inquiry:%20\">lwz@email.arizona.edu<\/a><\/p>\n<div align=\"center\">\n<hr align=\"center\" size=\"2\" width=\"100%\" \/>\n<\/div>\n<p align=\"center\"><strong>PNNL dataset (28311 spectra)<\/strong><\/p>\n<p>The PNNL dataset contains 28311 spectra (25% singly charged, 62% doubly charged and 13% triply charged) from unmodified\u00a0Deinococcus\u00a0radiodurans\u00a0and\u00a0Shewanella\u00a0oneidensis\u00a0peptides collected by the Pacific Northwest National Laboratory (PNNL) on a Thermo LCQ ion trap mass spectrometer. The dataset was used to optimize and test our algorithm.<\/p>\n<p align=\"center\"><span style=\"color: #000080;\"><a href=\"http:\/\/chemistry.osu.edu\/~wysocki.11\/Projects\/28311_dta.zip\" target=\"_blank\"><span style=\"color: #000080;\">Download the dataset<\/span><\/a><\/span><\/p>\n<p>References for the PNNL dataset:<br \/>\n1.\u00a0Lipton, M. S.;\u00a0Pasa-Tolic, L.; Anderson, G. A.; Anderson, D. J.; Auberry, D. L.; Battista, J. R.; Daly, M. J.; Fredrickson, J.; Hixson, K. K.;\u00a0Kostandarithes, H.; Masselon, C.;\u00a0Markillie, L. M.; Moore, R. J.; Romine, M. F.;\u00a0Shen, Y.;\u00a0Stritmatter, E.; Tolic, N.;\u00a0Udseth, H. R.;\u00a0Venkateswaran, A.; Wong, K.; Zhao, R.; Smith, R. D., Global analysis\u00a0of the\u00a0Deinococcus\u00a0radiodurans\u00a0proteome by using accurate mass tags. <em>Proc\u00a0Natl\u00a0Acad\u00a0Sci\u00a0U S A<\/em>, 2002, 99, (17), 11049-11054.<br \/>\n2.\u00a0Kolker, E.;\u00a0Picone, A. F.;\u00a0Galperin, M. Y.; Romine, M. F.; Higdon, R.;\u00a0Makarova, K. S.;\u00a0Kolker, N.; Anderson, G. A.;\u00a0Qiu, X.; Auberry, K. 
J.;\u00a0Babnigg, G.;\u00a0Beliaev, A. S.; Edlefsen, P.; Elias, D. A.;\u00a0Gorby, Y. A.;\u00a0Holzman, T.;\u00a0Klappenbach, J. A.; Konstantinidis, K. T.; Land, M. L.; Lipton, M. S.; McCue, L.; Monroe, M.;\u00a0Pasa-Tolic, L.;\u00a0Pinchuk, G.;\u00a0Purvine, S.;\u00a0Serres, M. H.;\u00a0Tsapin, S.;\u00a0Zakrajsek, B. A.; Zhu, W.; Zhou, J.; Larimer, F. W.; Lawrence, C. E.; Riley, M.;\u00a0Collart, F. R.; Yates, J. R.; Smith, R. D.;\u00a0Giometti, C. S.;\u00a0Nealson, K. H.; Fredrickson, J. K.;\u00a0Tiedje, J. M., Global profiling of\u00a0Shewanella\u00a0oneidensis\u00a0MR-1: expression of hypothetical genes and improved functional annotations.\u00a0<em>Proc\u00a0Natl\u00a0Acad\u00a0Sci\u00a0U S A<\/em>, 2005, 102, (6), 2099-2104.<\/p>\n<div align=\"center\">\n<hr align=\"center\" size=\"2\" width=\"100%\" \/>\n<\/div>\n<p align=\"center\"><strong>Hemoglobin data<\/strong><\/p>\n<p>We recently reported the successful\u00a0<em>de novo<\/em>\u00a0sequencing of\u00a0hemoglobins\u00a0from nine small mammals native to North America using LC-MS\/MS combined with\u00a0PepNovo. 
The spectra files as well as the\u00a0PepNovo\u00a0results for each species can be downloaded from the following links:<\/p>\n<p><em><span style=\"color: #000080;\"><a href=\"http:\/\/dl.dropbox.com\/u\/53361062\/Microtus%20Pennsylvanicus.zip\" target=\"_blank\"><span style=\"color: #000080;\">Microtus\u00a0pennsylvanicus<\/span><\/a><\/span><span style=\"color: #000080;\"><br \/>\n<\/span><span style=\"color: #000080;\"><a href=\"http:\/\/dl.dropbox.com\/u\/53361062\/Peromyscus%20californicus.zip\" target=\"_blank\"><span style=\"color: #000080;\">Peromyscus\u00a0californicus<\/span><\/a><\/span><span style=\"color: #000080;\"><br \/>\n<a href=\"http:\/\/dl.dropbox.com\/u\/53361062\/Peromyscus%20Crinitus.zip\" target=\"_blank\"><span style=\"color: #000080;\">Peromyscus\u00a0crinitus<\/span><\/a><br \/>\n<a href=\"http:\/\/dl.dropbox.com\/u\/53361062\/Sciurus%20Carolinensis.zip\" target=\"_blank\"><span style=\"color: #000080;\">Sciurus\u00a0carolinensis<\/span><\/a><br \/>\n<a href=\"http:\/\/dl.dropbox.com\/u\/53361062\/Spermophilus%20Beecheyi.zip\" target=\"_blank\"><span style=\"color: #000080;\">Spermophilus\u00a0beecheyi<\/span><\/a><br \/>\n<a href=\"http:\/\/dl.dropbox.com\/u\/53361062\/Tamias%20Merriami.zip\" target=\"_blank\"><span style=\"color: #000080;\">Tamias\u00a0merriami<\/span><\/a><br \/>\n<a href=\"http:\/\/dl.dropbox.com\/u\/53361062\/Tamias%20Striatus.zip\" target=\"_blank\"><span style=\"color: #000080;\">Tamias\u00a0striatus<\/span><\/a><br \/>\n<a href=\"http:\/\/dl.dropbox.com\/u\/53361062\/Tamiasciurus%20hudsonicus.zip\" target=\"_blank\"><span style=\"color: #000080;\">Tamiasciurus\u00a0hudsonicus<\/span><\/a><br \/>\n<a href=\"http:\/\/dl.dropbox.com\/u\/53361062\/Blarina%20Brevicauda.zip\" target=\"_blank\"><span style=\"color: #000080;\">Blarina\u00a0brevicauda<\/span><\/a><\/span><\/em><\/p>\n<p>Reference: \u00dcnige A. Laskay, Erin J. Kaleta, Inger-Marie E. Vilcins, Sam R. Telford III, Alan G. Barbour, Vicki H. Wysocki. 
Development of a Host Blood Meal Database: De Novo Sequencing of Hemoglobin from Nine Small Mammals Using Mass Spectrometry,\u00a0<em>Biological Chemistry<\/em>, 393, pp. 195\u2013201, 2012.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n","protected":false},"excerpt":{"rendered":"<p>SeQuence\u00a0IDentification\u00a0(SQID) is a database searching algorithm for tandem mass spectrometry developed in\u00a0Wysocki\u00a0group. The SQID program and\u00a0source code\u00a0are available under the GNU GPL license. Download SQID (for windows) Usage Open &#8220;SQID_1.0.exe&#8221;,\u00a0specify\u00a0fasta\u00a0database and\u00a0dta\u00a0folders, then click &#8220;Index&amp;Rundtalist&#8221;. &#8220;Run\u00a0dtalist&#8221; button will only be available when an &#8220;.index&#8221; file is used instead of &#8220;.fasta&#8221; file. Indexing any database will create &hellip; <a href=\"https:\/\/research.cbc.osu.edu\/wysocki.11\/group-home\/bioinformatics\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> 
&#8220;Bioinformatics&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":4,"menu_order":3,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-98","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/research.cbc.osu.edu\/wysocki.11\/wp-json\/wp\/v2\/pages\/98","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/research.cbc.osu.edu\/wysocki.11\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/research.cbc.osu.edu\/wysocki.11\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/research.cbc.osu.edu\/wysocki.11\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/research.cbc.osu.edu\/wysocki.11\/wp-json\/wp\/v2\/comments?post=98"}],"version-history":[{"count":2,"href":"https:\/\/research.cbc.osu.edu\/wysocki.11\/wp-json\/wp\/v2\/pages\/98\/revisions"}],"predecessor-version":[{"id":1264,"href":"https:\/\/research.cbc.osu.edu\/wysocki.11\/wp-json\/wp\/v2\/pages\/98\/revisions\/1264"}],"up":[{"embeddable":true,"href":"https:\/\/research.cbc.osu.edu\/wysocki.11\/wp-json\/wp\/v2\/pages\/4"}],"wp:attachment":[{"href":"https:\/\/research.cbc.osu.edu\/wysocki.11\/wp-json\/wp\/v2\/media?parent=98"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}