A proteomic profiling of laser-microdissected lung adenocarcinoma cells of early lepidic-types

Background: In the new pathologic classification of lung adenocarcinoma proposed by IASLC/ATS/ERS in 2011, lepidic type adenocarcinomas comprise three subtypes: adenocarcinoma in situ (AIS), minimally invasive adenocarcinoma (MIA) and lepidic predominant invasive adenocarcinoma (LPIA). Although these subtypes are speculated to show sequential progression from preinvasive lesion to invasive lung cancer, changes in protein expression during this progression have not been fully studied. This study aims to provide a proteomic view of the early lepidic type lung adenocarcinomas.

Methods: A total of nine formalin-fixed and paraffin-embedded (FFPE) lepidic type lung adenocarcinoma tissues were selected from our archives, three tissues each of AIS, MIA and LPIA. The tumor and peripheral non-tumor cells in these FFPE tissues were collected with laser microdissection (LMD). Using liquid chromatography-tandem mass spectrometry (MS/MS), protein compositions were compared with respect to the peptide separation profiles among tumors collected from the three types of tissues, AIS, MIA and LPIA. Identified proteins were semi-quantified by a spectral counting-based or identification-based approach, and statistical evaluation was performed by pairwise G-tests.

Results: A total of 840 proteins were identified. Spectral counting-based semi-quantitative comparisons of all identified proteins from AIS through LPIA revealed that the protein expression profile of LPIA was significantly differentiated from those of the other subtypes. Seventy proteins, including HPX, CTTN, CDH1, EGFR and MUC1, were found as LPIA-type marker candidates; 15 candidates for MIA-type markers included CRABP2, LMO7 and RNPEP; and 26 candidates for AIS-type markers included LTA4H and SOD2. The STRING gene set enrichment resulting from the protein-protein interaction (PPI) network analysis suggested that AIS was associated with the pathways of focal adhesion, adherens junction and tight junction, and that MIA was associated predominantly with proteoglycans in cancer and also with PI3K-Akt signaling. In contrast, LPIA was associated broadly with numerous tumor-progression pathways including ErbB, Ras, Rap1 and HIF-1 signaling.

Conclusions: The proteomic profiles obtained in this study demonstrated the technical feasibility of identifying protein candidates differentially expressed in FFPE tissues of LPIA. Our results may provide candidate disease-oriented proteins that may be related to mechanisms of the early-stage progression of lung adenocarcinoma.

Electronic supplementary material: The online version of this article (doi:10.1186/s40169-015-0064-3) contains supplementary material, which is available to authorized users.


Background
Lung cancer is the leading cause of cancer-related mortality worldwide [1]. In Japan, annual deaths from lung cancer are increasing and currently approach 70,000 [2], while in the United States, despite a recent decreasing trend in mortality, more than 160,000 people succumb annually [3]. Worldwide, advances in chest high-resolution computed tomography (HRCT) scanning technology have increasingly enabled the localization of small adenocarcinoma nodules [4] at an earlier and potentially more curable stage of development than previously possible [5]. There are 90 million current and ex-smokers in the United States who are at increased risk of lung cancer. The published data from the National Lung Screening Trial (NLST) suggest that yearly screening with low-dose thoracic CT scan in heavy smokers can reduce lung cancer mortality by 20 % and all-cause mortality by 7 % [6].
In 2011, the new pathologic classification of lung adenocarcinoma was proposed by the International Association for the Study of Lung Cancer (IASLC), the American Thoracic Society (ATS) and the European Respiratory Society (ERS) [7]. In the new classification, the concepts of adenocarcinoma in situ (AIS) and minimally invasive adenocarcinoma (MIA) were newly introduced and the term bronchioloalveolar carcinoma (BAC) was abolished. Additionally, invasive adenocarcinomas were categorized into 6 subtypes (lepidic, acinar, papillary, micropapillary, solid, and variants) according to the predominant histologic pattern. Both AIS and MIA were defined as tumors ≤ 3 cm in size. AIS is a preinvasive lesion showing pure lepidic growth without invasion. MIA is also a lepidic predominant tumor but with ≤ 5 mm invasion. Lepidic predominant invasive adenocarcinoma (LPIA) is an invasive adenocarcinoma showing the former nonmucinous BAC pattern with > 5 mm invasion. These 3 lepidic type adenocarcinomas are speculated to show step-wise progression from AIS through MIA to LPIA. After complete resection of AIS or MIA, a recurrence-free 5-year survival of 100 % is usually obtained [7], while some recurrent cases are found after resection of LPIA [8][9][10]. Since postoperative prognoses differ between the AIS plus MIA group and LPIA, differential protein expression associated with the invasiveness of cancer cells in each subtype should play an important role in determining local recurrence and survival. However, precise proteomic analyses using individual cells in these early adenocarcinomas have not yet been performed. To the best of our knowledge, this is the first report of a proteomic analysis using micro-dissected early phase lung adenocarcinoma cells.
Recent advancements in shotgun sequencing and quantitative mass spectrometry for protein analyses could make proteomics amenable to clinical biomarker discovery [11,12]. Laser microdissection (LMD) has made it possible to collect target cells from a variety of formalin-fixed paraffin-embedded (FFPE) cancer tissues. This study attempts to capture a proteomic view of LPIA in comparison with other early stage lung adenocarcinomas by utilizing a label-free identification-based (or spectral counting-based) semi-quantitative shotgun proteomics approach following LMD [13][14][15][16][17][18][19].

Group comparisons by Rsc and G-statistics
We used Abacus [20] to select high-scoring proteins using the thresholds of PeptideProphet probability > 0.99 and ProteinProphet probability > 0.9 as described in "Materials and Methods", which identified a total of 840 proteins. For the G-test (p < 0.05) [21], the raw spectral counts (SpCs) of all patients in each group were pooled, thereby improving the performance of the G-test and decreasing false positive rates significantly [15,22]. Next, the values of Rsc, a measure of the fold change in protein expression level on the log2 scale, were calculated from the spectral counts of these proteins as described in "Materials and Methods".
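As an illustration only, the pooling of spectral counts and the fold-change measure can be sketched as follows. The formula is not written out in this text; the sketch assumes the form commonly used in the spectral counting literature, Rsc = log2[(n2 + f)/(n1 + f)] + log2[(t1 − n1 + f)/(t2 − n2 + f)], with a correction factor f = 1.25. The function names and the example counts are hypothetical and are not taken from the study data.

```python
import math


def pool_counts(per_patient_counts):
    """Sum spectral counts per protein over all patients of one group."""
    pooled = {}
    for counts in per_patient_counts:
        for protein, spc in counts.items():
            pooled[protein] = pooled.get(protein, 0) + spc
    return pooled


def rsc(n1, n2, t1, t2, f=1.25):
    """Log2 fold change of group 2 versus group 1 (assumed formula).

    n1, n2: pooled spectral counts of the protein in groups 1 and 2
    t1, t2: total pooled spectral counts of groups 1 and 2
    f:      correction factor (assumed value) that keeps the ratio
            finite when a count is zero
    """
    return (math.log2((n2 + f) / (n1 + f))
            + math.log2((t1 - n1 + f) / (t2 - n2 + f)))


# Hypothetical counts for three patients per group (not real data).
group1 = pool_counts([{"CTTN": 0, "EGFR": 1}, {"CTTN": 1}, {"EGFR": 2}])
group2 = pool_counts([{"CTTN": 6, "EGFR": 4}, {"CTTN": 5}, {"EGFR": 3}])
t1, t2 = sum(group1.values()), sum(group2.values())
fold_changes = {p: rsc(group1[p], group2[p], t1, t2) for p in group1}
```

A positive value indicates a higher relative abundance in the second group, and swapping the two groups changes only the sign.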
The full list of the 840 identified proteins is provided as Additional file 1: Table S3. The numbers of proteins identified in LPIA, MIA and AIS with a total SpC > 2 per protein were 789, 607, and 544, respectively, and these proteins were subjected to gene ontology (GO) analysis using PANTHER Ver. 10.0 (http://www.pantherdb.org/). Results for (A) biological processes and (B) protein classes are shown in Fig. 1.
Src substrate cortactin (CTTN), epidermal growth factor receptor (EGFR) and mucin-1 (MUC1) expressed in LPIA might reflect its invasiveness with aggressive proliferation. Invasive carcinoma cells degrade and invade through the extracellular matrix (ECM) by means of invadopodia, where an EGFR-Src-Arg-cortactin pathway is considered to mediate the functional maturation of invadopodia [23][24][25]. Overexpression of cortactin is currently considered an important biomarker for invasive cancers because of its frequent link to various invasive cancers, including melanoma, colorectal cancer, and glioblastoma [25].

Protein-protein network analysis of expressed proteins
Network analysis of significant proteins is also helpful for understanding how they interplay with other key proteins and pathways. In this study, the network analysis of protein-protein interactions was performed using the STRING database version 10 [32,33]. Only experiments, databases and text mining were used as evidence, to avoid less confident predicted interactions. The STRING PPI network obtained for the 70 proteins significantly expressed in LPIA (given in Table 1) is shown in Fig. 2 (also given as Additional file 2: Figure S1). The STRING PPI networks obtained for AIS and MIA are provided as Additional file 3: Figure S2 and Additional file 4: Figure S3. Figure 3 illustrates the results of the STRING gene set enrichments (GSEs) for LPIA, MIA, and AIS against cancer-related KEGG pathways, which were selected at a significance of p < 0.05 after correction by false discovery rate (FDR). All results are provided in Additional file 1: Table S4. Enrichments for AIS indicated a strong association with the pathways of focal adhesion (p = 2.69 × 10⁻¹⁶), adherens junction (p = 6.45 × 10⁻¹²) and leukocyte transendothelial migration (p = 9.79 × 10⁻¹³). MIA was found to be associated with PI3K-Akt signaling (p = 8.25 × 10⁻⁶) and predominantly with proteoglycans in cancer (p = 3.99 × 10⁻¹⁷). In contrast, LPIA was associated broadly with numerous cancer-related pathways, which included proteoglycans in cancer (p = 5.94 × 10⁻⁵), ErbB signaling (p = 2.99 × 10⁻³), Ras signaling (p = 5.77 × 10⁻³), Rap1 signaling (p = 9.72 × 10⁻⁴), chemokine signaling (p = 7.23 × 10⁻⁵), and HIF-1 signaling (p = 5.77 × 10⁻³). Proteoglycans are known to be important molecular effectors of cell surface and pericellular microenvironments and to have multiple functions in cancer and angiogenesis by interacting with both ligands and receptors that regulate neoplastic growth and neovascularization [34].
Molecules participating in the proteoglycan-related cancer pathway are denoted by red circles in Additional file 3: Figure S2. The ErbB signaling pathway is associated with many cancer pathways. The ErbB family comprises epidermal growth factor receptors, which play an important role in tumor growth. Over-expression of EGFR occurs in around 60 % of non-small cell lung cancers (NSCLC), among which adenocarcinoma has the higher frequency [35]. Hypoxia-inducible factors (HIFs) regulate the transcription of genes that mediate the response to hypoxia (reduced O2 availability) [36]. It is considered that diverse products of HIF-1 action, such as induction of the Met protein and hepatocyte growth factor (HGF) followed by Met receptor activation, may result in the poor prognosis attached to hypoxic tumors, which indeed turn out to be more aggressive than their well-oxygenated counterparts. Molecules participating in the ErbB and HIF-1 signaling pathways are denoted by orange and red circles, respectively, in Fig. 2. Numerous clinical data have demonstrated that increased levels of HIF-1 proteins are associated with a poor prognosis and increased patient mortality in many different human cancers including NSCLC [37].
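The FDR correction applied to the enrichment p-values can be illustrated with a short sketch. The text does not state which FDR procedure was used; the sketch assumes the Benjamini-Hochberg step-up procedure, and the pathway names and raw p-values in the example are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # stay monotone in the rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw enrichment p-values (not taken from the study).
raw = {"pathway A": 0.01, "pathway B": 0.04,
       "pathway C": 0.03, "pathway D": 0.005}
names = list(raw)
corrected = dict(zip(names, benjamini_hochberg([raw[n] for n in names])))
significant = [n for n in names if corrected[n] < 0.05]
```

Pathways whose corrected value stays below 0.05 would be retained, which mirrors the selection criterion stated above.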

Conclusion
Former localized BAC (≤ 2 cm) lesions were histologically classified into types A, B and C by Noguchi et al. based on findings of local cancer progression [38]. These lesions, now identified as AIS, MIA or LPIA, usually show focal ground-glass opacity (GGO) on chest HRCT. Generally, AIS shows pure GGO, and representative MIA and LPIA lesions show GGO with some intratumoral areas of collapsed shadow suggesting invasion. Multiple studies [39,40] describe that limited lung resection, including wedge resection or segmentectomy, can cure early adenocarcinomas showing pure GGO. From histologic and radiologic points of view, it is hypothesized that preinvasive AIS progresses sequentially to the invasive lesions MIA and LPIA. Postsurgical 5-year recurrence-free survival rates for AIS and MIA are 100 %, while those for LPIA ranged from 71.9 to 93.8 % [8][9][10].
The molecular biological background predisposing LPIA to a worse prognosis than AIS and MIA may be due in part to the altered protein expression found in the present study. Proteins appearing in the step from AIS to MIA are probably important at the initial step of microinvasion. As LPIA acquires characteristics of mature lung cancer, it is reasonable that LPIA expresses a variety of proteins associated with cancer invasion. We believe that some of these proteins are candidates for molecular targeted therapy to suppress local invasion or distant metastases.
In the new adenocarcinoma subtyping, prognoses of solid or micropapillary predominant invasive adenocarcinomas were reported to be apparently worse than those of other subtypes including lepidic type adenocarcinomas [10,41,42]. We expect that comparative proteomic analyses such as the one presented here will in the future contribute to elucidating the protein expression that determines the malignant grade of various lung adenocarcinoma subtypes. This will in turn provide important knowledge for understanding the carcinogenetic process and tumor lineages of lung adenocarcinomas, to the benefit of patients through more efficient diagnosis and treatment of these tumors.

Ethics approval
The study protocol conformed to the principles of the Declaration of Helsinki. All patients provided informed consent, and the study protocol was approved by the institutional ethics committee of Tokyo Medical University Hospital.

FFPE tissues and sample preparation
Surgically removed lung tissues were fixed with a buffered formalin solution containing 10-15 % methanol, and embedded by a conventional method. Archived paraffin blocks of formalin-fixed adenocarcinoma tissues were obtained from cases of AIS (n = 3), MIA (n = 3), and LPIA (n = 3), and were retrieved with the approval of the Ethical Committee of Tokyo Medical University Hospital (Approval No. 1964). Patients' characteristics are listed in Table 2. Paraffin blocks were cut into 4-μm sections for diagnosis and 10-μm sections for proteomics. The 10-μm sections were stained with haematoxylin only. Three pathologists (M.N., Y.K.) independently confirmed adenocarcinoma subtypes using the 4-μm sections stained with haematoxylin-eosin (HE).

Fig. 2 The high-resolution evidence-view of STRING PPI networks obtained on LPIA by using 70 proteins significantly expressed (listed in Table 1), which were generated using default setting in network depth of 50 interactions under medium confidence (0.4) and standard criteria for linkage only with experiments, databases, and textmining

Laser capture and protein solubilization
Cancerous lesions were identified on serial tissue sections stained with hematoxylin and eosin (HE). For proteomic analysis, a 10-μm thick section prepared from the same tissue block was attached onto DIRECTOR™ slides (OncoPlexDx, Rockville, MD, USA), de-paraffinized twice with xylene for 5 min, rehydrated with graded ethanol solutions and distilled water, and stained with hematoxylin. The slides were air-dried and subjected to laser microdissection with a Leica LMD6000 (Leica Microsystems GmbH, Ernst-Leitz-Strasse, Wetzlar, Germany). At least 30,000 cells (8.17 ± 0.03 mm²) per tissue were collected directly into a 1.5-mL low-binding plastic tube. From each of the three tissue types, the same number of cells was also collected from non-cancerous lesions far from the tumors as the pseudo-normal (pN) group (n = 3). Representative HE-stained images of adenocarcinoma tissues of AIS, MIA and LPIA were shown in Fig. 4 together with examples of

Exploratory liquid chromatography-tandem mass spectrometry
The digested samples were analyzed in triplicate and in random order by liquid chromatography (LC)-tandem mass spectrometry (MS/MS) using reverse-phase liquid chromatography (RP-LC) interfaced with an LTQ-Orbitrap XL hybrid mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) via a closed nano-electrospray device (ADVANCE Captive Spray Source; AMR Inc., Japan). The RP-LC system consisted of a Paradigm MS4 (Michrom Bioresources, USA), a peptide Cap-Trap cartridge (0.3 × 5.0 mm) and a capillary separation column (an L-column Micro of 0.1 × 150 mm packed with reverse-phase L-C18 gels of 3 μm in diameter and 12-nm pore size; CERI, Tokyo, Japan). An autosampler (HTC-PAL, CTC Analytics, Switzerland) loaded an aliquot of each sample onto the trap, which was then washed with solvent A (98 % distilled water with 2 % acetonitrile and 0.1 % formic acid) to concentrate the peptides on the trap and to desalt them. Subsequently, the trap was connected in series to the separation column, and the columns were developed for 100 min with a linear acetonitrile concentration gradient from 5 to 35 % solvent B (10 % distilled water and 90 % acetonitrile containing 0.1 % formic acid) at a flow rate of 300 nL/min.
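For orientation, the solvent composition along this gradient can be estimated with a small sketch. It assumes ideal volume mixing of solvents A and B (volume contraction is ignored) and a perfectly linear ramp with no delay volume; the function names are hypothetical.

```python
def percent_b(t_min, start=5.0, end=35.0, duration=100.0):
    """Percentage of solvent B at time t_min of the linear gradient."""
    t = min(max(t_min, 0.0), duration)  # clamp to the gradient window
    return start + (end - start) * t / duration


def acetonitrile_percent(t_min, acn_in_a=2.0, acn_in_b=90.0):
    """Approximate acetonitrile content (%, v/v) of the mixed eluent.

    Solvent A contains 2 % and solvent B 90 % acetonitrile, as stated
    in the text; ideal mixing is an assumption of this sketch.
    """
    b_fraction = percent_b(t_min) / 100.0
    return (1.0 - b_fraction) * acn_in_a + b_fraction * acn_in_b


# Composition at the start, midpoint and end of the 100-min gradient.
profile = [(t, percent_b(t), acetonitrile_percent(t)) for t in (0, 50, 100)]
```

Under these assumptions the eluent runs from roughly 6 % to roughly 33 % acetonitrile over the 100 min.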
The LTQ was operated in the data-dependent MS/MS mode to automatically acquire up to three successive MS/MS scans in the centroid mode. The three most intense precursor ions for these MS/MS scans were selected from a high-resolution MS spectrum (a survey scan) that the Orbitrap had acquired during a predefined short time window in the profile mode at a resolution of 30,000, with a lock mass of m/z 536.1654, in the m/z range of 350 to 1500. The sets of acquired high-resolution MS and MS/MS spectra for peptides were converted to single data files, which were merged into Mascot generic format files for database searching.
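The precursor selection step of this data-dependent mode can be sketched as below. This is a simplified illustration, not the instrument's acquisition logic: real acquisition methods typically apply further filters (for example dynamic exclusion or charge-state rules) that are not described here, and the function name and peak list are hypothetical.

```python
def select_precursors(survey_scan, top_n=3, mz_min=350.0, mz_max=1500.0):
    """Pick the m/z values of the top_n most intense peaks in range.

    survey_scan: list of (mz, intensity) tuples from one survey scan.
    """
    in_range = [peak for peak in survey_scan if mz_min <= peak[0] <= mz_max]
    ranked = sorted(in_range, key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _intensity in ranked[:top_n]]


# Hypothetical survey scan; the first and last peaks lie outside 350-1500.
scan = [(300.2, 9.0e6), (420.7, 1.0e5), (655.3, 8.0e5),
        (812.4, 4.0e5), (990.5, 6.0e5), (1600.1, 7.0e6)]
precursors = select_precursors(scan)
```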

Protein identification
To extract protein candidates characterizing lepidic type adenocarcinoma from the experimentally acquired shotgun proteomic datasets, consisting of 36 runs (triplicate runs of 12 samples from 4 groups: AIS, MIA, LPIA, and pseudo-normal, pN), we utilized a label-free spectral counting approach for protein identification and semi-quantitative comparison.
The mass spectral raw data were analyzed using the onepath method of X!Tandem with a k-score plugin in the Trans-Proteomic Pipeline (TPP) [44,45] against the combined protein FASTA file from the H-Invitational Database (H-InvDB) [46], RefSeq, and UniProtKB/Swiss-Prot, appended with reversed decoy sequences. Peptide mass tolerance was 10 ppm and fragment mass tolerance was 0.5 Da; up to two missed cleavages and non-tryptic cleavage at one end of a peptide were allowed. Methionine oxidation was considered as a variable modification. The output files from the search engine were converted to pepXML files and subjected to peptide-spectrum match (PSM) posterior probability calculation with PeptideProphet [47] and then to ProteinProphet for identification at the protein level in the TPP [44,45].
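Two of the search settings above lend themselves to a brief illustration: appending reversed decoy sequences to the target database, and testing a precursor against the 10 ppm mass tolerance. The sketch below is not the TPP or X!Tandem implementation; the `DECOY_` header prefix, the function names and the example sequence are assumptions made for illustration.

```python
def add_reversed_decoys(entries, prefix="DECOY_"):
    """Return the target entries followed by their reversed decoys.

    entries: list of (header, sequence) tuples parsed from a FASTA file.
    prefix:  hypothetical tag used to mark decoy headers.
    """
    decoys = [(prefix + header, sequence[::-1]) for header, sequence in entries]
    return list(entries) + decoys


def ppm_error(observed_mass, theoretical_mass):
    """Relative mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass, theoretical_mass, tol_ppm=10.0):
    """True if the observed mass lies within +/- tol_ppm of the theoretical."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm


# Hypothetical single-entry database and precursor masses.
database = add_reversed_decoys([("P1", "MKTAYIAK")])
accepted = within_tolerance(1000.005, 1000.0)   # about 5 ppm off
rejected = within_tolerance(1000.020, 1000.0)   # about 20 ppm off
```

Matches to the decoy entries give an empirical estimate of how often random matches pass the score thresholds.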

Semi-quantitative comparison
For calculating spectral counts (SpCs) at the protein level, triplicate X!Tandem [48] results for each patient were simultaneously analyzed with PeptideProphet and the single Fold changes of expressed proteins in the base 2 logarithmic scale (Rsc) were calculated using spectral counting as described [15]. Proteins differentially expressed between two groups were selected when their Rsc was > 1 or < −1, corresponding to fold changes > 2 or < 0.5, and their p-value was < 0.05 in the G-test [19]. Although the G-test does not require replicates, the spectral counts for each protein from the triplicates were pooled and used for the G-statistic calculation with an Excel macro, using a two-way contingency table arranged in two rows, for the target protein and all other proteins, and two columns, for the two cancer groups. The Yates correction for continuity was applied to the 2 × 2 tables. This correction enabled us to handle data containing small spectral counts, including zero.
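The contingency-table test and the selection rule described above can be sketched in a few lines. The original calculation was performed with an Excel macro that is not reproduced here; the sketch below is an independent illustration that assumes the usual form of the Yates correction (each observed count is moved by at most 0.5 toward its expected value) and one degree of freedom for the 2 × 2 table. The function names are hypothetical.

```python
import math


def g_test_yates(n1, n2, t1, t2):
    """G-test on a 2 x 2 table with Yates continuity correction.

    n1, n2: pooled spectral counts of the target protein in groups 1, 2
    t1, t2: total pooled spectral counts of groups 1, 2
    Returns (G, p) with p taken from the chi-square distribution, df = 1.
    """
    observed = [[n1, n2], [t1 - n1, t2 - n2]]
    row = [sum(r) for r in observed]
    col = [observed[0][j] + observed[1][j] for j in range(2)]
    total = sum(row)
    if 0 in row or 0 in col:
        return 0.0, 1.0  # degenerate table: no evidence of a difference
    g = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / total
            diff = expected - observed[i][j]
            # Yates correction: shift the observed count toward the
            # expected count by 0.5, but never past it.
            adjusted = observed[i][j] + math.copysign(min(0.5, abs(diff)), diff)
            if adjusted > 0:
                g += 2.0 * adjusted * math.log(adjusted / expected)
    g = max(g, 0.0)  # guard against tiny negative rounding errors
    p = math.erfc(math.sqrt(g / 2.0))  # chi-square survival function, df = 1
    return g, p


def is_differential(rsc_value, p_value, rsc_cutoff=1.0, alpha=0.05):
    """Selection rule from the text: |Rsc| > 1 and p < 0.05."""
    return abs(rsc_value) > rsc_cutoff and p_value < alpha
```

Because the corrected counts are never zero when the expected counts are positive, the logarithm stays defined even for a protein with no spectra in one group.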

Network analysis of protein-protein interactions
Network analysis of protein-protein interactions was carried out using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database version 10 [32,33], in which nodes are proteins and edges are the predicted functional associations based on primary databases, comprising KEGG and GO, and on the primary literature. STRING can predict these interactions based on neighbourhood, gene fusion products, homology and similarity of coexpression patterns, experiments, databases and text mining. Network interaction scores for each node are expressed as a joint probability derived from curated databases of experimental information, text mining and computational prediction by genetic proximity [32]. In this study, STRING networks were calculated under the criteria for linkage only with experiments, databases, and text mining, with the default settings: medium confidence score of 0.400, network depth of 0 and up to 50 interactions.
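The effect of restricting the evidence channels and applying the confidence cutoff can be illustrated as follows. This is a simplified sketch and not the STRING scoring code: it combines per-channel scores as 1 − Π(1 − s) and omits the prior correction that STRING applies, it reads "up to 50 interactions" as a cap on the number of retained edges, and the edge records and scores are hypothetical.

```python
CHANNELS = ("experiments", "databases", "textmining")


def combined_score(edge, channels=CHANNELS):
    """Joint probability over the selected evidence channels (simplified)."""
    remaining = 1.0
    for channel in channels:
        remaining *= 1.0 - edge.get(channel, 0.0)
    return 1.0 - remaining


def filter_edges(edges, min_score=0.4, max_edges=50):
    """Keep edges at or above min_score, strongest first, capped at max_edges."""
    scored = [(combined_score(edge), edge["a"], edge["b"]) for edge in edges]
    kept = [item for item in scored if item[0] >= min_score]
    kept.sort(reverse=True)
    return kept[:max_edges]


# Hypothetical per-channel scores; coexpression is present but ignored.
edges = [
    {"a": "EGFR", "b": "CDH1", "experiments": 0.3, "databases": 0.2,
     "coexpression": 0.9},
    {"a": "EGFR", "b": "MUC1", "textmining": 0.3, "coexpression": 0.95},
]
network = filter_edges(edges)
```

In this example only the first edge survives: the second one relies mainly on coexpression, which is excluded by the linkage criteria.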

Additional files
Additional file 1:
Additional file 2: Figure S1. The high-resolution evidence-view of STRING PPI networks obtained on LPIA by using 70 proteins significantly expressed (listed in Table 1), which were generated using default setting in network depth of 50 interactions under medium confidence (0.4) and standard criteria for linkage only with experiments, databases, and text mining.
Additional file 3: Figure S2.  Table S1), which were generated using default setting in network depth of 50 interactions under medium confidence (0.4) and standard criteria for linkage only with experiments, databases, and text mining.
Additional file 4: Figure S3. The high-resolution evidence-view of STRING PPI networks obtained on MIA by using 15 proteins significantly expressed (listed in Additional file 1: Table S2), which were generated using default setting in network depth of 50 interactions under medium confidence (0.4) and standard criteria for linkage only with experiments, databases, and text mining.