Development of Diagnostics and Prognostics for Lung Cancer
Abstract
Lung cancer is a significant public health problem worldwide and is associated with substantial morbidity and mortality. Prompt diagnosis and treatment are effective in reducing the mortality associated with the cancer. However, current diagnostic and prognostic methods have limitations, including poor sensitivity for peripheral lesions and a higher risk of pneumothorax. Research has shown that molecular detection of biomarkers in sputum can be used for early and accurate diagnosis and prognosis of lung cancer. Additional molecular biomarkers can be identified using gene expression microarrays. This study aims to explore gene expression in lung cancer to identify biomarkers that can be used for the diagnosis and prognosis of the condition. The study will use primary and secondary empirical data obtained through gene expression microarrays, artificial neural networks and a systems biology approach, together with a search of the ArrayExpress database. The results are expected to provide lung cancer biomarkers that can be used for early and accurate diagnosis of the cancer as well as its prognosis.
Introduction
Lung cancer is a significant public health problem worldwide, and its two types, small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC), can be present in either tissues or blood. According to Islami et al. (2015), lung cancer killed approximately 1,590,000 people in 2012 and is currently the leading cause of cancer death worldwide. Ridge et al. (2013) also reported approximately six million new cases of lung cancer worldwide, with 12.7% of the total world prevalence being diagnosed in 2012. In the United States, Dela Cruz et al. (2011) reported 239,320 new cases of lung cancer and 161,250 deaths. Research has also established that the condition has a global geographic distribution with marked regional variation, with age-standardized incidence rates varying more than 60-fold in men and 30-fold in women (Ridge et al. 2013). Lung cancer is the most common cancer and the leading cause of cancer death in men globally, whereas in women it is the fourth most commonly diagnosed cancer and the second most common cause of cancer death (Dela Cruz et al. 2011). The increased prevalence and mortality rates of lung cancer have been associated with increased tobacco smoking, especially among men (Schwartz & Cote 2016; Hecht 2012).
Prompt diagnosis and treatment are effective in reducing the mortality associated with the cancer. According to Rivera et al. (2013), the methods used for the diagnosis of lung cancer include sputum cytology, conventional and flexible bronchoscopy, radial endobronchial ultrasound (R-EBUS)-guided lung biopsy, electromagnetic navigation (EMN) bronchoscopy, transthoracic needle aspiration (TTNA) or biopsy, pleural fluid cytology, and pleural biopsy. Sputum cytology has been reported as an acceptable method for the diagnosis of lung cancer, with a pooled sensitivity of 66% and a specificity of 99% (Jiang et al. 2011; Rivera et al. 2013). However, Rivera et al. (2013) report that the sensitivity of sputum cytology varies with the site of the lung cancer. Conventional, flexible and EMN bronchoscopy have different sensitivities and specificities in the diagnosis of lung cancer, with flexible bronchoscopy having an overall sensitivity of 88% (Rivera et al. 2013). However, the sensitivity and diagnostic yield of bronchoscopy are lower for peripheral lesions: lesions with a diameter of less than 2 cm show a sensitivity of only 34%, and those with a diameter of more than 2 cm a sensitivity of 63% (Rivera et al. 2013).
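The sensitivity and specificity values quoted for these methods are derived from a two-by-two table of test results against the confirmed diagnosis. The following is a minimal sketch of that calculation in Python. The counts are hypothetical and were chosen only so that the output matches the pooled sputum cytology values quoted above; they are not data from any cited study, and the function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of patients with cancer whom the test correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of patients without cancer whom the test correctly clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (not taken from any cited study).
tp, fn = 66, 34  # 100 patients with confirmed lung cancer
tn, fp = 99, 1   # 100 patients without lung cancer

print(f"sensitivity = {sensitivity(tp, fn):.2f}")  # prints "sensitivity = 0.66"
print(f"specificity = {specificity(tn, fp):.2f}")  # prints "specificity = 0.99"
```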
R-EBUS is an emerging technology for the diagnosis of lung cancer and has been reported to have a 73% diagnostic yield. The technique uses a probe with an ultrasound transducer to provide a 360° radial image of the surrounding structures. The probe is inserted into the working channel of the bronchoscope and then advanced into different segments of the targeted lung lobe to obtain a characteristic image of the lobe. A meta-analysis by Steinfort et al. (2011) established that R-EBUS has a pooled sensitivity of 73% and a specificity of 100% for the diagnosis of lung cancer. However, the study also reported that the diagnostic yield of the technique is lower for small lesions than for large ones (78% for lesions larger than 20 mm vs. 56% for lesions smaller than 20 mm). Additionally, the specificity and sensitivity of the technique were reported to be positively influenced by the prevalence of the cancer in the studied patients. Nevertheless, R-EBUS is generally safe, having been associated with only a 1% risk of pneumothorax (Steinfort et al. 2011).
TTNA plays a significant role in the diagnosis and treatment of many thoracic conditions, including lung cancer. According to Birchard (2011), TTNA is a less invasive surgical procedure that can be used for the diagnosis of lung cancer under the guidance of computed tomography or ultrasound. Rivera et al. (2013) established that the pooled sensitivity of the technique for the diagnosis of lung cancer is 90%. However, the sensitivity of the technique decreases for lesions smaller than 2 cm in diameter. In addition, the technique is less safe than bronchoscopy because it is associated with a higher rate of pneumothorax.
Pleural fluid cytology is a method of choice for patients with a malignant pleural effusion. In lung cancer, pleural metastases are very common in the visceral pleura and become focal when the parietal pleura is involved (Rivera et al. 2013). Therefore, pleural fluid cytology is considered a more sensitive diagnostic procedure than other diagnostic techniques. Bielsa et al. (2008) established that pleural fluid cytology has a sensitivity of 48.5%, which increases when a second pleural fluid specimen is examined. Indeed, Rivera et al. (2013) report that the sensitivity of this technique when at least two pleural fluid specimens are submitted ranges from 49% to 91%. However, a definitive diagnosis of lung cancer metastatic to the pleural space cannot be obtained by this technique, so a pleural biopsy is used to provide the definitive diagnosis. The diagnostic yield of a pleural biopsy has been reported to range from 75% to 88%, with thoracoscopic biopsy of the pleura having a 95% to 97% diagnostic yield (Rivera et al. 2013). However, pleural biopsy is a more invasive procedure and may predispose the patient to pneumothorax.
All these diagnostic methods have their advantages and disadvantages. Sputum cytology is a less invasive diagnostic method with good sensitivity, but the sensitivity varies with the site of the lung cancer. While bronchoscopy has a high sensitivity for endobronchial lung cancer, it has poor sensitivity for peripheral lesions and cannot be used for the diagnosis of such lung cancers. TTNA has excellent sensitivity for the diagnosis of malignant diseases such as lung cancer but carries a higher risk of pneumothorax than the bronchoscopy techniques. Emerging technologies such as R-EBUS have excellent specificity (100%) and high sensitivity (73%) and are safe, but they have a lower diagnostic yield for small lesions, and their sensitivity is influenced by the prevalence of the cancer in the studied patients. Although pleural fluid cytology has a high sensitivity, especially when two pleural fluid specimens are submitted, it cannot provide a definitive diagnosis of lung cancer metastatic to the pleural space. Therefore, researchers recommend adequate tissue acquisition for histological examination, the use of biomarkers and molecular diagnosis.
Different biomarkers have been reported to be sensitive and precise in the diagnosis of lung cancer. For instance, Li et al. (2012) used an electrochemiluminescence immunoassay to detect the biomarkers expressed in 530 patients with lung cancer in comparison with 229 healthy individuals. The study established that carcinoembryonic antigen, cytokeratin 19, neuron-specific enolase and carbohydrate antigen-125 were present at higher levels in patients with pathologically confirmed lung cancer than in the healthy individuals. These findings suggest that the biomarkers can be used for accurate diagnosis of lung cancer. However, the detection of these biomarkers by non-molecular methods has a low sensitivity (Li et al. 2012). Therefore, molecular methods such as the polymerase chain reaction (PCR) can be applied to the diagnosis of lung cancer by detecting DNA mutations in the sputum of a patient. According to Hubers et al. (2013), the analysis of DNA mutations in sputum is a highly sensitive, simple, rapid and low-cost method for the diagnosis of cancer. The analysis targets the identification of mutated genes in the sputum. Research has shown that mutations in the oncogene KRAS and the tumour-suppressor gene p53 are associated with lung carcinogenesis (Hanahan & Weinberg 2011). Hubers et al. (2013) record that mutations in the KRAS gene have been reported in more than 50% of lung cancer cases. A study by Shigematsu et al. (2005) further found that KRAS mutations occur mostly in adenocarcinomas, with a frequency of 10% in Eastern countries and 20% to 30% in Western countries. Similarly, Destro et al. (2004) confirmed that KRAS mutations are present in 79% of the sputum samples of patients with lung cancer but absent in those without the cancer. More recently, Marchetti et al. (2009) found KRAS mutations in 19% of lung cancer cases by direct sequencing, whereas mutant-enriched sequencing detected them in 36% of cases. Mutations in the p53 gene have also been reported in more than 70% of small-cell lung cancers and approximately 50% of non-small cell lung cancers (Toyooka et al. 2003). Similarly, Wang et al. (2001) detected p53 gene mutations in 55.5% of patients with lung cancer but in only 1.75% of the patients with benign pulmonary disease (controls). Additionally, Petitjean et al. (2007) reported a correlation between the mutational hotspots of p53 in lung cancer and the hotspots of adduct formation by the polycyclic aromatic hydrocarbons in tobacco smoke, which has been shown to be a major risk factor for lung cancer. These findings suggest that mutations in the p53 gene are associated with lung cancer.
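The detection rates reported by Wang et al. (2001) can be turned into likelihood ratios, which express how strongly a positive or negative sputum result shifts the odds of lung cancer. The sketch below assumes that the 55.5% detection rate in patients can be read as the sensitivity and the 1.75% rate in controls as the false-positive rate; this reading is an assumption made for illustration, and the function name is hypothetical.

```python
def likelihood_ratios(sensitivity: float, false_positive_rate: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a binary diagnostic test."""
    specificity = 1.0 - false_positive_rate
    lr_positive = sensitivity / false_positive_rate
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Proportions quoted from Wang et al. (2001) in the text above; treating them
# as sensitivity and false-positive rate is an assumption of this sketch.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.555, false_positive_rate=0.0175)

print(f"LR+ = {lr_pos:.1f}")
print(f"LR- = {lr_neg:.2f}")
```

Under these assumptions a positive result would raise the odds of lung cancer roughly thirty-fold, whereas a negative result would lower them by only about half, which is consistent with a marker that is specific but of moderate sensitivity.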
Therefore, molecular detection of KRAS and p53 gene mutations can serve as a non-invasive method for the diagnosis of lung cancer. Already, Point-EXACCT and peptide nucleic acid–PCR–restriction fragment length polymorphism (PNA–PCR–RFLP) have been developed for the molecular analysis of sputum for KRAS mutations (Hubers et al. 2013). DNA microarray technology and other gene expression profiling techniques can be used for the detection of mutations in the p53 gene (Yang 2009). These methods allow early, non-invasive diagnosis of lung cancer and therefore effective treatment. Indeed, research has shown that KRAS mutations can be detected in sputum one year before the clinical diagnosis of lung cancer (Hubers et al. 2013).
In addition to the diagnosis of lung cancer, these molecular methods can be used for its prognosis. Although several prognostic factors such as smoking, tumour cell differentiation and dietary supplements can be used for the prognosis of lung cancer, they often provide an inaccurate prognosis (Yang 2009). More accurate factors, such as mutations in the KRAS and p53 genes, can therefore serve as better prognostic indicators of the cancer. Research has also demonstrated the use of molecular markers for the prognosis of lung cancer. For instance, Marchetti et al. (2009) found that KRAS mutations are significantly associated with resistance to tyrosine kinase inhibitor therapy for lung cancer. The study also found that KRAS mutations affected progression-free survival and overall survival. These findings suggest that the detection of KRAS mutations by molecular methods can be used for the prognosis of lung cancer with regard to survival and the effectiveness of therapy. Similarly, literature reviews by Mogi and Kuwano (2011) and Campling and El-Deiry (2003) established that non-small-cell lung cancers with mutations in the p53 gene are relatively more resistant to radiotherapy and chemotherapy and therefore have a worse prognosis. These findings also suggest that the detection of p53 mutations during the diagnosis of lung cancer can be used to predict the response of the cancer to radiotherapy and chemotherapy.
In summary, all the non-molecular methods have limitations in the diagnosis and prognosis of lung cancer, as already discussed. In contrast, molecular techniques for the detection of mutations in the KRAS and p53 genes in sputum have been shown to be associated with accurate early diagnosis of lung cancer as well as accurate prognosis with regard to therapy and survival. Despite these findings, no molecular methods have been developed for the combined diagnosis and prognosis of lung cancer. Additional molecular biomarkers can also be explored by investigating gene expression in lung cancer. Therefore, this study will aim at exploring the gene expression in lung cancer to identify biomarkers for lung cancer that can be used for the diagnosis and prognosis of the condition.
Aims and Objectives
This study will aim at exploring the gene expression in lung cancer to identify biomarkers for lung cancer that can be used for the diagnosis and prognosis of the condition. The specific objectives of the study will include:
- To identify the molecular biomarkers of lung cancer using gene expression arrays, search in the ArrayExpress databases and artificial neural networks.
- To develop the diagnostics and prognostics of lung cancer using the molecular biomarkers
Experimental Design and Methodology
The study will use both primary and secondary empirical data to establish the molecular biomarkers for the diagnosis of lung cancer. Primary data will be generated by gene expression arrays (Lancashire et al. 2010), while secondary data will be retrieved from the ArrayExpress database (ArrayExpress 2016) and analysed with artificial neural networks (Krogh 2008). Additionally, a systems biology approach using the artificial neural networks will be used to investigate the biological interactions between the identified markers so as to identify their genes and biological functions. These methods are discussed in detail below.
Data Sources
The data for this study will be retrieved from the ArrayExpress database of microarray gene expression data. The database is a platform that archives functional genomics data, including data from microarray experiments (ArrayExpress 2016). The keywords “lung”, “cancer”, “diagnosis” and “prognosis” will be used during the search.
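The keyword screening step can be sketched as a simple filter over experiment descriptions. The example below does not call the ArrayExpress service; it works on a small list of made-up records, and the accession strings, the record layout and the choice to require all four keywords (rather than any one of them) are assumptions made for illustration.

```python
KEYWORDS = ("lung", "cancer", "diagnosis", "prognosis")


def matches_all(description: str, keywords=KEYWORDS) -> bool:
    """True when every keyword occurs in the description (case-insensitive)."""
    lowered = description.lower()
    return all(keyword in lowered for keyword in keywords)


# Hypothetical records standing in for exported search results.
records = [
    {"accession": "EXAMPLE-1",
     "description": "Gene expression in lung cancer: diagnosis and prognosis markers"},
    {"accession": "EXAMPLE-2",
     "description": "Breast cancer prognosis microarray study"},
    {"accession": "EXAMPLE-3",
     "description": "Lung cancer sputum samples for early diagnosis"},
]

selected = [r["accession"] for r in records if matches_all(r["description"])]
print(selected)  # prints ['EXAMPLE-1']
```

Requiring every keyword keeps the retrieved set small and specific; relaxing `all` to `any` would widen the search at the cost of more manual screening.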
Gene Expression Arrays
According to Lancashire et al. (2010), gene expression microarrays are a method for the high-throughput analysis of a large number of gene transcripts. The technology has been widely used in biological and molecular classification and in predicting the clinical outcome of different cancers. Gene expression microarrays are effective in evaluating the genes that are expressed in cells and in understanding the interactions between large numbers of genes. Previous studies have reported the effectiveness of gene expression arrays in the diagnosis and expression profiling of cancers such as breast cancer (Kumar et al. 2012) and leukemia (Song et al. 2006). This study will use the gene expression array methodology described by Macgregor and Squire (2002) to provide information on the genes that determine the biomarkers of lung cancer.
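One simple way to shortlist candidate biomarkers from microarray data is to rank genes by how strongly their expression differs between tumour and normal samples. The sketch below uses Welch's t-statistic for this purpose; it is a generic illustration rather than the methodology of Macgregor and Squire (2002), and the gene names and log2 expression values are invented.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch's t-statistic for two independent samples with unequal variances."""
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    return (mean_a - mean_b) / math.sqrt(var_a / len(group_a) + var_b / len(group_b))


# Hypothetical log2 expression values for three made-up genes.
expression = {
    "GENE_A": {"tumour": [8.1, 8.4, 7.9, 8.3], "normal": [5.0, 5.2, 4.9, 5.1]},
    "GENE_B": {"tumour": [6.0, 6.2, 5.9, 6.1], "normal": [6.1, 5.9, 6.0, 6.2]},
    "GENE_C": {"tumour": [3.0, 3.3, 2.8, 3.1], "normal": [4.9, 5.1, 5.0, 5.2]},
}

# Rank genes by the absolute t-statistic, largest difference first.
ranked = sorted(
    expression,
    key=lambda gene: abs(welch_t(expression[gene]["tumour"], expression[gene]["normal"])),
    reverse=True,
)
print(ranked)  # prints ['GENE_A', 'GENE_C', 'GENE_B']
```

With thousands of transcripts on a real array, a ranking of this kind would need a correction for multiple testing before any gene is treated as a candidate marker.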
Artificial Neural Networks and Systems Biology Approach
Artificial neural networks are models inspired by biological neural networks such as the brain and are used to estimate functions that depend on a large number of inputs (Krogh 2008). Artificial neural networks have been used for the classification of different diseases and the identification of biomarkers for diseases including cancer because of their ability to cope with complex datasets like those produced by gene microarray experiments (Lancashire et al. 2008). Indeed, previous studies have demonstrated the application of artificial neural networks to the diagnosis of colorectal cancer (Coppedè et al. 2015), breast cancer (Saritas 2012; Álvarez Menéndez et al. 2010) and leukemia (Afshar et al. 2011). In this study, artificial neural networks will be used for the detection of biomarkers of lung cancer. After the identification of the biomarkers, a systems biology approach utilizing the artificial neural network interface algorithm will be used to map the genetic interactions between the biomarkers so as to evaluate their genes and related functions. The systems biology approach described by Agarwal et al. (2014) will be used for the identification of the biomarkers of lung cancer. The interaction of the model in the approach will be visualized and verified as shown in Figure 1.
Figure 1: Systems Biology Approach for the Artificial Neural Networks Interference Algorithm
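To make the classification step concrete, the following is a minimal sketch of an artificial neural network with one hidden layer, trained by stochastic gradient descent to separate tumour from normal samples on two expression values. It is a generic illustration and does not reproduce the algorithms of Lancashire et al. or Agarwal et al.; the class name, network size, learning rate and the scaled expression values are all assumptions.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """One hidden layer, sigmoid activations, trained by stochastic gradient descent."""

    def __init__(self, n_inputs: int, n_hidden: int, seed: int = 0):
        rng = random.Random(seed)  # fixed seed so the sketch is reproducible
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                         for _ in range(n_hidden)]
        self.b_hidden = [0.0] * n_hidden
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b_out = 0.0

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
                  for ws, b in zip(self.w_hidden, self.b_hidden)]
        output = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden)) + self.b_out)
        return hidden, output

    def train(self, samples, labels, epochs: int = 500, lr: float = 0.5):
        for _ in range(epochs):
            for x, y in zip(samples, labels):
                hidden, out = self.forward(x)
                # Gradient of the cross-entropy loss w.r.t. the output pre-activation.
                delta_out = out - y
                for j, h in enumerate(hidden):
                    # Back-propagate through hidden unit j before updating its output weight.
                    delta_h = delta_out * self.w_out[j] * h * (1.0 - h)
                    self.w_out[j] -= lr * delta_out * h
                    self.b_hidden[j] -= lr * delta_h
                    for i, xi in enumerate(x):
                        self.w_hidden[j][i] -= lr * delta_h * xi
                self.b_out -= lr * delta_out

    def predict(self, x) -> int:
        return 1 if self.forward(x)[1] >= 0.5 else 0


# Hypothetical scaled expression of two made-up markers; label 1 = tumour, 0 = normal.
samples = [[0.90, 0.20], [0.80, 0.10], [0.85, 0.30], [0.95, 0.25],
           [0.20, 0.80], [0.10, 0.90], [0.30, 0.70], [0.15, 0.85]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

model = TinyMLP(n_inputs=2, n_hidden=3)
model.train(samples, labels)
print([model.predict(x) for x in samples])
```

In practice the network would be trained on many more samples and evaluated on data held out from training, since accuracy on the training samples alone says little about how a marker panel will perform on new patients.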
Milestones
The study is expected to be completed within a four-month period (16 weeks). The anticipated duration of each part of the study and the timing of the main outcomes are outlined in the following Gantt chart.
Week | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
Discussion of project aspects and ethical approval | ||||||||||||||||
Literature search and literature review | ||||||||||||||||
Data search in ArrayExpress | ||||||||||||||||
Gene expression microarrays, artificial neural networks and systems biology approach | ||||||||||||||||
Validation and interpretation of results | ||||||||||||||||
Writing study report | ||||||||||||||||
Peer review | ||||||||||||||||
Final report and submission |
Table 1: Gantt Chart Showing the Study Milestones
Reference List
Afshar, S. et al., 2011. Recognition and prediction of leukemia with Artificial Neural Network (ANN). Medical Journal of Islamic Republic of Iran, 25(1), pp.35–39.
Agarwal, D. et al., 2014. A systems biology approach to identify proliferative biomarkers and pathways in breast cancer. In IEEE International Conference on Bioinformatics and Biomedicine. Available at: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6999240.
Álvarez Menéndez, L. et al., 2010. Artificial neural networks applied to cancer detection in a breast screening programme. Mathematical and Computer Modelling, 52(7-8), pp.983–991. Available at: http://linkinghub.elsevier.com/retrieve/pii/S0895717710001378.
ArrayExpress, 2016. ArrayExpress – functional genomics data. ArrayExpress. Available at: https://www.ebi.ac.uk/arrayexpress/.
Bielsa, S. et al., 2008. Accuracy of pleural fluid cytology in malignant effusions. Anales de medicina interna (Madrid, Spain : 1984), 25(4), pp.173–7. Available at: http://www.ncbi.nlm.nih.gov/pubmed/18604333.
Birchard, K., 2011. Transthoracic Needle Biopsy. Seminars in Interventional Radiology, 28(01), pp.087–097. Available at: http://www.thieme-connect.de/DOI/DOI?10.1055/s-0031-1273943.
Campling, B.G. & El-Deiry, W.S., 2003. Clinical implications of p53 mutations in lung cancer. Methods in molecular medicine, 75, pp.53–77. Available at: http://www.ncbi.nlm.nih.gov/pubmed/12407735.
Coppedè, F. et al., 2015. Application of artificial neural networks to link genetic and environmental factors to DNA methylation in colorectal cancer. Epigenomics, 7(2), pp.175–186. Available at: http://www.futuremedicine.com/doi/10.2217/epi.14.77.
Dela Cruz, C.S., Tanoue, L.T. & Matthay, R.A., 2011. Lung Cancer: Epidemiology, Etiology, and Prevention. Clinics in Chest Medicine, 32(4), pp.605–644. Available at: http://linkinghub.elsevier.com/retrieve/pii/S0272523111000943.
Destro, A. et al., 2004. K-ras and p16(INK4A) alterations in sputum of NSCLC patients and in heavy asymptomatic chronic smokers. Lung cancer (Amsterdam, Netherlands), 44(1), pp.23–32. Available at: http://www.ncbi.nlm.nih.gov/pubmed/15013580.
Hanahan, D. & Weinberg, R.A., 2011. Hallmarks of Cancer: The Next Generation. Cell, 144(5), pp.646–674. Available at: http://linkinghub.elsevier.com/retrieve/pii/S0092867411001279.
Hecht, S.S., 2012. Lung carcinogenesis by tobacco smoke. International Journal of Cancer, 131(12), pp.2724–2732. Available at: http://doi.wiley.com/10.1002/ijc.27816.
Hubers, A.J. et al., 2013. Molecular sputum analysis for the diagnosis of lung cancer. British Journal of Cancer, 109(3), pp.530–537. Available at: http://www.nature.com/doifinder/10.1038/bjc.2013.393.
Islami, F., Torre, L.A. & Jemal, A., 2015. Global trends of lung cancer mortality and smoking prevalence. Translational lung cancer research, 4(4), pp.327–38. Available at: http://www.ncbi.nlm.nih.gov/pubmed/26380174.
Jiang, Y. et al., 2011. [Analysis of sensitivity and specificity of sputum cytology screening for lung cancer with different positive criteria]. Zhonghua yu fang yi xue za zhi [Chinese journal of preventive medicine], 45(7), pp.605–8. Available at: http://www.ncbi.nlm.nih.gov/pubmed/22041563.
Krogh, A., 2008. What are artificial neural networks? Nature Biotechnology, 26(2), pp.195–197. Available at: http://www.nature.com/doifinder/10.1038/nbt1386.
Kumar, R., Sharma, A. & Tiwari, R., 2012. Application of microarray in breast cancer: An overview. Journal of Pharmacy and Bioallied Sciences, 4(1), p.21. Available at: http://www.jpbsonline.org/text.asp?2012/4/1/21/92726.
Lancashire, L.J. et al., 2010. A validated gene expression profile for detecting clinical outcome in breast cancer using artificial neural networks. Breast Cancer Research and Treatment, 120(1), pp.83–93.
Lancashire, L.J., Lemetre, C. & Ball, G.R., 2008. An introduction to artificial neural networks in bioinformatics–application to complex microarray and mass spectrometry datasets in cancer studies. Briefings in Bioinformatics, 10(3), pp.315–329. Available at: http://bib.oxfordjournals.org/cgi/doi/10.1093/bib/bbp012.
Li, X. et al., 2012. Biomarkers in the Lung Cancer Diagnosis: A Clinical Perspective. Neoplasma, 59(05), pp.500–507. Available at: http://www.elis.sk/index.php?page=shop.product_details&flypage=flypage.tpl&product_id=2896&category_id=87&option=com_virtuemart.
Macgregor, P.F. & Squire, J.A., 2002. Application of microarrays to the analysis of gene expression in cancer. Clin Chem, 48(8), pp.1170–1177.
Marchetti, A. et al., 2009. Clinical implications of KRAS mutations in lung cancer patients treated with tyrosine kinase inhibitors: an important role for mutations in minor clones. Neoplasia (New York, N.Y.), 11(10), pp.1084–92. Available at: http://www.ncbi.nlm.nih.gov/pubmed/19794967.
Mogi, A. & Kuwano, H., 2011. TP53 Mutations in Nonsmall Cell Lung Cancer. Journal of Biomedicine and Biotechnology, 2011, pp.1–9. Available at: http://www.hindawi.com/journals/bmri/2011/583929/.
Petitjean, A. et al., 2007. TP53 mutations in human cancers: functional selection and impact on cancer prognosis and outcomes. Oncogene, 26(15), pp.2157–65. Available at: http://www.ncbi.nlm.nih.gov/pubmed/17401424.
Ridge, C., McErlean, A. & Ginsberg, M., 2013. Epidemiology of Lung Cancer. Seminars in Interventional Radiology, 30(02), pp.093–098. Available at: http://www.thieme-connect.de/DOI/DOI?10.1055/s-0033-1342949.
Rivera, M.P., Mehta, A.C. & Wahidi, M.M., 2013. Establishing the Diagnosis of Lung Cancer. Chest, 143(5), pp.e142S–e165S. Available at: http://linkinghub.elsevier.com/retrieve/pii/S0012369213602937.
Saritas, I., 2012. Prediction of Breast Cancer Using Artificial Neural Networks. Journal of Medical Systems, 36(5), pp.2901–2907. Available at: http://link.springer.com/10.1007/s10916-011-9768-0.
Schwartz, A.G. & Cote, M.L., 2016. Epidemiology of lung cancer. Advances in Experimental Medicine and Biology, 893, pp.21–41.
Shigematsu, H. et al., 2005. Clinical and biological features associated with epidermal growth factor receptor gene mutations in lung cancers. Journal of the National Cancer Institute, 97(5), pp.339–46. Available at: http://www.ncbi.nlm.nih.gov/pubmed/15741570.
Song, J.H. et al., 2006. Identification of gene expression signatures for molecular classification in human leukemia cells. International journal of oncology, 29(1), pp.57–64. Available at: http://www.ncbi.nlm.nih.gov/pubmed/16773185.
Steinfort, D.P. et al., 2011. Radial probe endobronchial ultrasound for the diagnosis of peripheral lung cancer: systematic review and meta-analysis. The European respiratory journal, 37(4), pp.902–10. Available at: http://www.ncbi.nlm.nih.gov/pubmed/20693253.
Toyooka, S., Tsuda, T. & Gazdar, A.F., 2003. The TP53 gene, tobacco exposure, and lung cancer. Human mutation, 21(3), pp.229–39. Available at: http://www.ncbi.nlm.nih.gov/pubmed/12619108.
Wang, B. et al., 2001. Detection of p53 gene mutations in sputum samples and their implications in the early diagnosis of lung cancer in suspicious patients. Chinese medical journal, 114(7), pp.694–7. Available at: http://www.ncbi.nlm.nih.gov/pubmed/11780329.
Yang, P., 2009. Epidemiology of Lung Cancer Prognosis: Quantity and Quality of Life. Methods in Molecular Biology, 471, pp.469–486. Available at: http://link.springer.com/10.1007/978-1-59745-416-2_24.