Sage-N Research believes that we are seeing the emergence of Proteomics 2.0, which extends beyond basic protein ID into statistically sound experiments involving quantitation, PTMs, and pathways.
The science and art of proteomics is starting to evolve from a basic technology into a source of relevant information.
An interesting parallel can be made with the internet revolution, which started with very basic search engines (Lycos, Alta Vista) before evolving into an information-rich, interconnected Web 2.0 experience, supported by the most sensitive and comprehensive search engine (Google) as a foundation.
Sage-N Research believes that proteomics can take over where microarrays leave off, by focusing on protein expression and post-translational modifications.
To make this happen, we need more advanced proteomics analysis than is routinely done in core labs today.
The power of mass spectrometry proteomics as the 'protein microscope' for cell biology lies in its ability to analyse large, million-spectra datasets from complex peptide mixtures.
Thanks to recent technological advances, that capability is now readily available.
The ability to see at a scale that is orders of magnitude greater than previously possible will continue to revolutionise cell biology.
If you are using proteomics in research, there are several reasons why you need to adopt a high-throughput (HT), million-spectra workflow, even if at present you are working mostly with simple gel spots.
The ability to do large dataset experiments becomes increasingly important, especially in today's competitive funding environments.
Since most researchers today have access to either Mascot or Sequest, the quickest way to get started with large-dataset analysis methodology (if not the actual throughput) is to download either a free 14-day trial version of Scaffold or the free Trans-Proteomic Pipeline (TPP) tool suite.
Unless you have prior experience with the more specialised TPP, Scaffold is probably the easier path.
Of the two most popular search engines, Sequest should be used for ion trap and Orbitrap spectra, and Mascot for Q-TOF and MALDI-TOF spectra.
If you use the Orbitrap side of the LTQ-Orbitrap for both precursor and fragment ions (ie, higher fragment accuracy at the expense of scan rate), then either Sequest or Mascot would work.
Try to use the entire recommended workflow without mixing and matching components; you will get higher-quality results that way.
For labs that need an automated high-throughput workflow requiring little IT maintenance, the Sorcerer 2 'integrated data appliance' system provides the optimal solution as a convenient analysis instrument that allows you to process large amounts of proteomic data without hassles.
Sage-N Research continues to recommend that proteomics researchers use standardised workflows to minimise inadvertent errors that arise from workflow subtleties.
Unbeknownst to occasional users of proteomic analysis tools, many aspects of proteomic analysis continue to be in a state of flux, including file formats, software libraries, new capabilities, how defaults are handled, etc.
The easiest way to stay out of trouble is to stick with the recommendations from data analysis experts, especially those from the user community and specialised software vendors.
In particular, Sage-N Research continues to recommend that users of high-accuracy mass spectrometers (particularly Orbitrap and FT systems) run searches with a precursor mass tolerance of +/-50 ppm, even though the theoretical accuracy can be much higher.
One key benefit is that the resulting delta-mass (the difference between the observed and theoretical precursor mass) is important auxiliary information that post-search validation tools use to help curve-fit noise distributions, compute the false positive rate, and tease out weak matches.
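To make the tolerance arithmetic concrete, here is a minimal Python sketch of how a ppm tolerance translates into a mass window and how a delta-mass is expressed in ppm. The function names and the 2000 Da example mass are illustrative assumptions, not part of any search engine or of the Sorcerer software.

```python
from typing import Tuple


def ppm_error(observed_mass: float, theoretical_mass: float) -> float:
    """Signed precursor mass error (delta-mass) in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def tolerance_window(theoretical_mass: float, tol_ppm: float = 50.0) -> Tuple[float, float]:
    """Lower and upper mass bounds (Da) of a symmetric +/- tol_ppm window."""
    half_width = theoretical_mass * tol_ppm / 1e6
    return theoretical_mass - half_width, theoretical_mass + half_width


if __name__ == "__main__":
    # Hypothetical 2000 Da peptide, chosen only to keep the arithmetic simple.
    low, high = tolerance_window(2000.0, tol_ppm=50.0)
    print(f"+/-50 ppm at 2000 Da: {low:.4f} to {high:.4f} Da")
    print(f"delta-mass of a 2000.02 Da observation: {ppm_error(2000.02, 2000.0):.1f} ppm")
```

At 2000 Da, a +/-50 ppm tolerance corresponds to a window of +/-0.1 Da, which is still narrow in absolute terms while leaving room for the delta-mass distribution to be observed.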
Recently, Sage-N has found another reason to search with the widened mass tolerance: robustness against variability in precursor mass calculation.
Apparently, it is possible to get different precursor masses from the latest versions of ISB's ReAdW utility vs Thermo's ExtractMSN.
In a dataset provided by the University of Greifswald, precursor masses generated by ReAdW were about 10 to 15 ppm higher than those generated by ExtractMSN using common parameters.
In a spot check of five spectra from this dataset with high-scoring top peptides, Sage-N found that the ExtractMSN masses were between 6 and 24 ppm higher than 'actual' (ie, theoretical from the top peptide hit), while ReAdW masses were between 14 and 39 ppm higher.
Under recommended search conditions (eg, 50 ppm tolerance), the search results were largely identical (18 of 19 proteins reported with 95%+ confidence were the same).
However, the results differed clearly when the mass tolerance was significantly tightened.
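The effect of tightening the tolerance can be illustrated with the offset ranges quoted above. The sketch below is a simplification under stated assumptions: it uses only the reported range endpoints (the per-spectrum values are not given), it looks at the precursor filter alone rather than a full search, and the 20 ppm and 10 ppm tolerances are hypothetical values chosen for illustration.

```python
from typing import Dict, Tuple

# Offset ranges (ppm above theoretical) from the five-spectrum spot check.
# Only the endpoints are reported in the text; per-spectrum values are unknown.
REPORTED_OFFSETS_PPM: Dict[str, Tuple[float, float]] = {
    "ExtractMSN": (6.0, 24.0),
    "ReAdW": (14.0, 39.0),
}


def within_tolerance(offset_ppm: float, tol_ppm: float) -> bool:
    """True if a precursor offset falls inside a symmetric +/- tol_ppm window."""
    return abs(offset_ppm) <= tol_ppm


def range_survives(offsets: Tuple[float, float], tol_ppm: float) -> str:
    """Classify a (low, high) offset range as 'all', 'some', or 'none' passing."""
    low_ok = within_tolerance(offsets[0], tol_ppm)
    high_ok = within_tolerance(offsets[1], tol_ppm)
    if low_ok and high_ok:
        return "all"
    if low_ok or high_ok:
        return "some"
    return "none"


if __name__ == "__main__":
    # 50 ppm is the recommended setting; 20 and 10 ppm are hypothetical.
    for tol in (50.0, 20.0, 10.0):
        for tool, offsets in REPORTED_OFFSETS_PPM.items():
            print(f"{tol:>4.0f} ppm  {tool:<10}  {range_survives(offsets, tol)}")
```

With a 50 ppm tolerance both ranges pass in full, whereas at 10 ppm the ReAdW range falls entirely outside the window and the ExtractMSN range only partly inside it, which is consistent with the two tools giving different results once the tolerance is tightened.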
Recall that ReAdW (pre-installed on Sorcerer) translates the Thermo RAW file into an mzXML file used for TPP downstream analysis, including protein ID and quantitation.
Thermo's ExtractMSN extracts and filters the peaklists from the RAW file specifically for protein ID.
Both programs use the same Xcalibur function library.
Sorcerer users who prefer to use ExtractMSN for their protein ID can simply submit their searches as zipped DTA files or as an SRF (requires upcoming Sorcerer PE v3.4 production update), which implicitly uses ExtractMSN, instead of searching the RAW file.