Hiroaki Ito, Naoyuki Uragami, Tomokazu Miyazaki, William Yang, Kenji Issha, Kai Matsuo, Satoshi Kimura, Yuji Arai, Hiromasa Tokunaga, Saiko Okada, Machiko Kawamura, Noboru Yokoyama, Miki Kushima, Haruhiro Inoue, Takashi Fukagai, Yumi Kamijo
Hiroaki Ito, Naoyuki Uragami, Kai Matsuo, Noboru Yokoyama, Haruhiro Inoue, Digestive Disease Center, Showa University Koto Toyosu Hospital, Tokyo 135-8577, Japan
Tomokazu Miyazaki, Research Division, JSR Corporation, Tokyo 105-0021, Japan
William Yang, BaySpec Inc., San Jose, CA 95131, United States
Kenji Issha, Fuji Technical Research Inc., Yokohama 220-6215, Japan
Satoshi Kimura, Department of Laboratory Medicine and Central Clinical Laboratory, Showa University Northern Yokohama Hospital, Yokohama 224-8503, Japan
Yuji Arai, Saiko Okada, Department of Clinical Laboratory, Showa University Koto Toyosu Hospital, Tokyo 135-8577, Japan
Hiromasa Tokunaga, Department of Clinical Laboratory, Showa University Hospital, Tokyo 142-8555, Japan, BML Inc., Tokyo 151-0051, Japan
Machiko Kawamura, Department of Hematology, Saitama Cancer Center, Inamachi, Saitama 362-0806, Japan
Miki Kushima, Department of Pathology, Showa University Koto Toyosu Hospital, Tokyo 135-8577, Japan
Takashi Fukagai, Department of Urology, Showa University Koto Toyosu Hospital, Tokyo 135-8577, Japan
Yumi Kamijo, Showa University Koto Toyosu Hospital, Tokyo 135-8577, Japan
Abstract
BACKGROUND Colorectal cancer (CRC) is an important disease worldwide, accounting for the second highest number of cancer-related deaths and the third highest number of new cancer cases. Blood testing is simple and minimally invasive. However, no blood test can currently diagnose CRC accurately.
AIM To develop a comprehensive, spontaneous, minimally invasive, label-free, blood-based CRC screening technique based on Raman spectroscopy.
METHODS We recorded Raman spectra from 184 serum samples obtained from patients undergoing colonoscopy. Patients with a history of malignant tumors and those with cancers in organs other than the large intestine were excluded. The diagnoses of the 184 patients were CRC (12), rectal neuroendocrine tumor (2), colorectal adenoma (68), colorectal hyperplastic polyp (18), and others (84). A 1064-nm laser was used for excitation, with the laser power set to 200 mW.
RESULTS Using the recorded Raman spectra as training data, we constructed boosted tree prediction models based on machine learning. The generalized R² values for CRC, adenomas, hyperplastic polyps, and neuroendocrine tumors were 0.9982, 0.9630, 0.9962, and 0.9986, respectively.
CONCLUSION Machine learning on Raman spectral data yielded a CRC prediction model with a high R² value. We are currently planning studies to evaluate the accuracy of this model with a large amount of additional data.
Key Words: Colorectal cancer; Raman spectroscopy; Machine learning; Blood; Serum; Diagnosis
Colorectal cancer (CRC) is an important disease worldwide. According to Globocan 2018 (http://gco.iarc.fr/today/data/factsheets/cancers/39-All-cancers-fact-sheet.pdf), among all cancers, CRC accounted for the second highest number of deaths and the third highest number of new cases[1]. Blood testing is simple and minimally invasive. However, no blood test can presently diagnose CRC accurately. Tumor marker tests such as those for carcinoembryonic antigen[2], carbohydrate antigen 19-9 (CA 19-9)[3], CA72-4[4], and CA125[5] are minimally invasive and can be performed in many medical institutions. However, these conventional tumor markers have low sensitivity for detecting CRC[6]; they are mostly used for predicting prognosis[7] and recurrence[8] rather than for early diagnosis. Cell-free DNA[9] and microRNA (miRNA)[10] have been reported in the blood of CRC patients; however, tests for them have not yet become common practice. Circulating cancer cells[11] can be detected in the blood of CRC patients, but it is unclear whether their detection is useful for the early diagnosis of CRC[12].
Raman spectroscopy is a non-destructive method[13] used to analyze the components of a sample. The technique can analyze samples in various states (gases, liquids, and solids) without labeling. Raman spectroscopy has also been studied for diagnosing cancer from blood samples and has been reported to be useful for the diagnosis of colorectal[14], gastric[15], esophageal[16], pancreatic[17], lung[18], breast[19], prostate[20,21], and bladder[22] cancers. Lin et al[14] analyzed serum from 38 CRC patients and 45 volunteers by gold nanoparticle-based surface-enhanced Raman spectroscopy and reported a diagnostic sensitivity and specificity of 97.4% and 100%, respectively. However, the effectiveness of Raman spectroscopy in cancer diagnosis has not yet been established. We previously recorded highly sensitive surface-enhanced Raman scattering spectra of human serum samples using a silver nanocomplex biochip[23,24]. The scattered light intensities of Raman shifts attributed to specific molecular bonds differed significantly between the serum samples of patients with stomach or colon cancer and those of patients with benign disease. However, the procedure was complicated, and the detectable substances were limited because specific silver nanoparticles had to be used. Thus, we decided to develop a comprehensive, label-free Raman technique to detect both known and unknown substances in serum. A subsequent preliminary study showed that our Raman spectroscopy system could detect spontaneous Raman scattering spectra from untreated human serum samples within 1 min[25]. Hence, we considered that serum analysis based on Raman spectroscopy could provide a rapid cancer diagnosis.
In this study, we confirmed the correlation between the Raman scattering spectra of the serum samples collected before examination and the endoscopic diagnosis of patients who underwent colonoscopies. Additionally, we constructed a model that predicts cancer with increased accuracy based on machine learning.
This study included patients who underwent colonoscopy at the Showa University Koto Toyosu Hospital (Tokyo, Japan) between September 2018 and September 2019. Patients were excluded if they were younger than 20 years or older than 80 years, had a history of malignancy, or had malignant disease in organs other than the colon. The study protocol complied with the Declaration of Helsinki and the Clinical Trial Act in Japan. It was reviewed and approved (No. 18T5005) by the Institutional Review Board of Showa University Koto Toyosu Hospital. All participants provided written consent for their participation in this study. The study protocol was registered in the University Hospital Medical Information Network clinical trial registry (UMIN-CTR, No. UMIN000034306).
Based on previous clinical research at the Showa University Koto Toyosu Hospital, at least 150 patients were required to capture > 3 CRC patients. Therefore, 184 patients were recruited for the study (110 men and 74 women, aged between 20 and 80 years). The median and mean ages were 57 and 56.9 years, respectively (Table 1).
In addition to colonoscopy, gastroscopy, computed tomography, ultrasonography, and magnetic resonance imaging were performed in 100, 51, 5, and 5 cases, respectively.
The primary diagnoses of the 184 patients recruited in this study were: CRC in 12 cases (3, tumor-node-metastasis [TNM] stage 0; 1, TNM stage I; 2, TNM stage II; 5, TNM stage III; 1, TNM stage IV), rectal carcinoids in 2 cases, colorectal adenomas in 68 cases, colon hyperplastic polyps in 18 cases, other diseases in 54 cases (1, leiomyoma; 9, nonspecific colitis; 2, ulcerative colitis; 24, colon diverticulum; 3, nonspecific ileitis; and 15, internal hemorrhoid), and no specific finding in 30 cases (Table 2). One CRC patient was also diagnosed with colorectal adenoma and another with ulcerative colitis. Sixteen of the 68 patients with colorectal adenoma were also diagnosed with colorectal hyperplastic polyps. All cancers and adenomas were histopathologically diagnosed by at least two qualified clinical pathologists. Polyps were diagnosed endoscopically, as hyperplastic polyps are not usually treated.
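The reported counts can be cross-checked with a few lines of arithmetic. The sketch below only re-adds the numbers given in the text; the dictionary keys and variable names are illustrative labels and are not part of the study.

```python
# Counts copied from the text above; keys are illustrative labels only.
primary_diagnoses = {
    "colorectal cancer": 12,
    "rectal carcinoid": 2,
    "colorectal adenoma": 68,
    "hyperplastic polyp": 18,
    "other diseases": 54,
    "no specific finding": 30,
}
crc_by_tnm_stage = {"0": 3, "I": 1, "II": 2, "III": 5, "IV": 1}
other_diseases = {
    "leiomyoma": 1,
    "nonspecific colitis": 9,
    "ulcerative colitis": 2,
    "colon diverticulum": 24,
    "nonspecific ileitis": 3,
    "internal hemorrhoid": 15,
}

total_patients = sum(primary_diagnoses.values())
total_crc = sum(crc_by_tnm_stage.values())
total_other = sum(other_diseases.values())
# The abstract groups "other diseases" and "no specific finding" as "others".
others_in_abstract = (primary_diagnoses["other diseases"]
                      + primary_diagnoses["no specific finding"])

print(total_patients, total_crc, total_other, others_in_abstract)
```

The totals agree with the 184 patients, 12 CRC cases, 54 other diseases, and 84 "others" stated in the paper.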
Table 1 Patients and evaluations
Table 2 Main diagnosis
Blood samples were collected prior to endoscopic examination. Serum was obtained by centrifuging the blood samples for 5 min (1500 × g). The serum samples were dispensed into 2.0 mL polypropylene microtubes (Biosphere® plus; Sarstedt AG & Co. KG, Sarstedtstraße, Nümbrecht, Germany), which were free from DNA, DNase/RNase, polymerase chain reaction inhibitors, adenosine triphosphate, and pyrogens/endotoxins. The specimens were preserved at -80 °C in an ultralow temperature freezer (MDF-C8V1; Panasonic Corporation, Osaka, Japan).
A Nomadic Raman microscope with a computer-controlled electrical stage running Pathologic System Software Version 1.0.1.0 (BaySpec Inc., San Jose, CA, United States) was used for analysis. The details of the Raman microscope have been described in a previous paper[25], and an outline is given below. A 1064-nm laser was selected for excitation, and its power was set to 200 mW. A 20 × objective lens with a correction collar for near-infrared microscopy (LCPLN20XIR; Olympus Corporation, Tokyo, Japan) and a 2048 × 64 pixel thermoelectrically cooled indium gallium arsenide charge-coupled device (CCD) detector with a spectral range of 100-3200 cm⁻¹ (grating 4 cm⁻¹) were used to record the spectra. A color CCD camera with 1392 × 1040 pixels and a maximum acquisition rate of 30 frames per second (Lw135R; Lumenera Corporation, Capella Court, Ottawa, ON, Canada) was used for focusing before each Raman scattering spectral acquisition (Figure 1). The dark background noise of the CCD camera was acquired as a spectrum in the absence of a sample and was subtracted from all spectra acquired with a sample present. Baseline correction was performed for each spectrum with the Pathologic System Software (BaySpec Inc.). Some figures in this paper were created using RaspWin Ver 8.0.1 (HT SoftLab) and Adobe Illustrator CS6 Version 16.2.0 (Adobe Systems Incorporated, San Jose, CA, United States).
An overview of the measurement and analysis processes is summarized in Online resource 10. The cryopreserved serum was thawed immediately before the measurement and was set at 25 °C. We manually placed a drop of the serum stock solution on the tip of a thin stainless-steel tube. The serum was irradiated with a 1064-nm wavelength laser three times for 15 s, and the average value was recorded as the Raman scattering spectra. A new droplet was then prepared, and the same measurement was repeated three times for each serum sample.
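The dark subtraction and exposure averaging described in the Methods amount to simple per-channel arithmetic, sketched below. This is not the Pathologic System Software: the function names and the three-channel toy numbers are made up for illustration, and baseline correction is omitted because its algorithm is not described in the text. Since both steps are linear, the order in which they are applied does not change the result.

```python
from statistics import fmean

def subtract_dark(spectrum, dark):
    """Subtract the dark (no-sample) spectrum channel by channel."""
    if len(spectrum) != len(dark):
        raise ValueError("spectrum and dark spectrum must have the same length")
    return [s - d for s, d in zip(spectrum, dark)]

def average_spectra(spectra):
    """Average several equally long spectra channel by channel."""
    if not spectra:
        raise ValueError("at least one spectrum is required")
    if any(len(s) != len(spectra[0]) for s in spectra):
        raise ValueError("all spectra must have the same length")
    return [fmean(channel) for channel in zip(*spectra)]

# Made-up three-channel example: one dark spectrum, three 15-s exposures.
dark = [1.0, 1.0, 2.0]
exposures = [
    [11.0, 21.0, 32.0],
    [13.0, 19.0, 30.0],
    [12.0, 20.0, 34.0],
]
droplet_spectrum = average_spectra([subtract_dark(e, dark) for e in exposures])
print(droplet_spectrum)
```

Repeating this for each of the three droplets would give three averaged spectra per serum sample.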
The scattered light intensities of 15 Raman shift ranges (A1-A15) related to nucleic acids[26-34], proteins[28,30-32,34,35], and lipids[27,28,30,32,34,36-41] in serum were extracted from the obtained Raman spectra (Figures 2 and 3, Table 3). The averages of the three scattered light intensities extracted for each sample were analyzed as training data by the boosted tree model[42] using JMP® Pro 14.3.0 (SAS Institute Inc., Cary, NC, United States). The averaged intensities in each patient group were evaluated for normality using the Tukey test. Intergroup differences in non-normally distributed data were compared using the Steel-Dwass nonparametric test. All analyses were performed using JMP® Pro 14.3.0 (SAS Institute Inc.).
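To illustrate the general idea behind a boosted tree classifier, the sketch below fits a sequence of one-split trees to the residuals of a logistic model, using the layer count (200) and learning rate (0.1) reported for this analysis. It is a minimal teaching example and not JMP's implementation: it uses one split per layer instead of three, ignores the overfit penalty and minimum split size, and runs on synthetic one-feature data. All function and variable names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(rows, residuals):
    """Return (feature, threshold, left_value, right_value) for the one-split
    tree that fits the residuals best in the least-squares sense."""
    best = None
    for feature in range(len(rows[0])):
        values = sorted({row[feature] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2.0
            left = [r for row, r in zip(rows, residuals) if row[feature] <= threshold]
            right = [r for row, r in zip(rows, residuals) if row[feature] > threshold]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, feature, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("no feature has two distinct values to split on")
    return best[1:]

def fit_boosted_stumps(rows, labels, n_layers=200, learning_rate=0.1):
    """Gradient boosting for a 0/1 response; both classes must be present."""
    prior = sum(labels) / len(labels)
    base_score = math.log(prior / (1.0 - prior))  # log-odds of the prior
    scores = [base_score] * len(rows)
    stumps = []
    for _ in range(n_layers):
        # Residual = observed label minus current predicted probability.
        residuals = [y - sigmoid(s) for y, s in zip(labels, scores)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        feature, threshold, left_value, right_value = stump
        scores = [
            s + learning_rate * (left_value if row[feature] <= threshold else right_value)
            for row, s in zip(rows, scores)
        ]
    return base_score, learning_rate, stumps

def predict_probability(model, row):
    base_score, learning_rate, stumps = model
    score = base_score
    for feature, threshold, left_value, right_value in stumps:
        score += learning_rate * (left_value if row[feature] <= threshold else right_value)
    return sigmoid(score)

# Synthetic, perfectly separable toy data (not study data).
toy_rows = [[1.0], [2.0], [3.0], [4.0], [6.0], [7.0], [8.0], [9.0]]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_model = fit_boosted_stumps(toy_rows, toy_labels)
print(predict_probability(toy_model, [8.5]), predict_probability(toy_model, [1.5]))
```

Each layer contributes only a fraction (the learning rate) of its fitted correction, which is why many layers are needed before the predicted probabilities approach 0 or 1.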
The relevant definitions were (Supplementary Table 1):
Entropy R-squared: $1 - \dfrac{\operatorname{Loglike}(\text{model})}{\operatorname{Loglike}(0)}$
Generalized R-squared: $\dfrac{1 - \left[L(0)/L(\text{model})\right]^{2/n}}{1 - L(0)^{2/n}}$
Mean -Log p: $\sum_{j} -\log[\rho(j)] \,/\, n$
Root-mean-square error: $\sqrt{\sum_{j} [y(j) - \rho(j)]^{2} / n}$
Mean absolute deviation: $\sum_{j} |y(j) - \rho(j)| \,/\, n$
Misclassification rate: $\sum_{j} [\rho(j) \neq \rho_{\max}] \,/\, n$
The detailed conditions for the analysis based on the boosted tree model were: Number of layers = 200 (maximum number of layers to include in the final tree is 200); Splits per tree = 3 (number of splits for each layer is 3); Learning rate = 0.1; Overfit penalty = 0.0001; Minimum size split = 5; Row sampling rate = 1; and Column sampling rate = 1 (Supplementary Table 2).
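The fit measures defined above can be computed directly from predicted class probabilities, as in the following sketch. It rests on several assumptions that the text does not spell out: ρ(j) is taken as the probability the model assigns to the class actually observed in row j, y(j) is taken as 1 for that class, and the null model L(0) assigns every row the overall class proportions. The function name, class labels, and toy probabilities are invented for illustration.

```python
import math
from collections import Counter

def fit_measures(observed, predicted):
    """observed: list of class labels (at least two distinct classes);
    predicted: list of dicts mapping each class label to its probability."""
    n = len(observed)
    # rho(j): probability given to the class that actually occurred (assumption).
    rho = [probs[label] for label, probs in zip(observed, predicted)]
    loglike_model = sum(math.log(p) for p in rho)
    # Null model: every row receives the overall class proportions (assumption).
    counts = Counter(observed)
    loglike_null = sum(c * math.log(c / n) for c in counts.values())
    # Likelihood ratios are evaluated on the log scale for numerical stability.
    generalized_r2 = (
        (1.0 - math.exp(2.0 / n * (loglike_null - loglike_model)))
        / (1.0 - math.exp(2.0 / n * loglike_null))
    )
    return {
        "entropy_r2": 1.0 - loglike_model / loglike_null,
        "generalized_r2": generalized_r2,
        "mean_neg_log_p": -loglike_model / n,
        "rmse": math.sqrt(sum((1.0 - p) ** 2 for p in rho) / n),
        "mean_abs_dev": sum(abs(1.0 - p) for p in rho) / n,
        # A row is misclassified when the observed class is not the most probable one.
        "misclassification_rate": sum(
            1 for p, probs in zip(rho, predicted) if p < max(probs.values())
        ) / n,
    }

# Invented four-row example with two classes.
toy_observed = ["crc", "other", "other", "other"]
toy_predicted = [
    {"crc": 0.9, "other": 0.1},
    {"crc": 0.2, "other": 0.8},
    {"crc": 0.1, "other": 0.9},
    {"crc": 0.6, "other": 0.4},
]
toy_measures = fit_measures(toy_observed, toy_predicted)
print(toy_measures)
```

Both R-squared measures approach 1 as the model's probabilities for the observed classes approach 1, which is why values near 1 on training data alone do not establish predictive accuracy on new samples.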
Raman spectra of all serum samples were recorded, and the highest values of the scattered light intensity in each of the ranges A1 to A15 were extracted (Figures 2 and 3). The boosted tree model was used to predict CRC, and a highly accurate model was constructed, with a generalized R² value of 0.9977 and an entropy R² value of 0.9982 (Supplementary Table 3). Similarly, the boosted tree model was used to predict colorectal adenomas and hyperplastic polyps. A highly accurate adenoma prediction model was constructed, with a generalized R² value of 0.9269 and an entropy R² value of 0.9630 (Supplementary Table 4). Furthermore, a highly accurate hyperplastic polyp prediction model was constructed, with a generalized R² value of 0.9947 and an entropy R² value of 0.9962 (Supplementary Table 5). The boosted tree model was also used to predict rectal neuroendocrine tumors based on data from two patients, and a highly accurate rectal neuroendocrine tumor prediction model was constructed, with a generalized R² value of 0.9985 and an entropy R² value of 0.9986 (Supplementary Table 6).
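The feature extraction described here (the highest intensity within each Raman shift range, averaged over the spectra of one sample) can be sketched as follows. Only the six ranges whose limits are stated in the text are listed; the remaining ranges are defined in Table 3 and are not reproduced. The function names and toy spectra are hypothetical.

```python
from statistics import fmean

# Range limits in cm^-1 as stated in the text; other A-ranges are in Table 3.
BANDS_CM1 = {
    "A1": (611, 631),
    "A3": (751, 771),
    "A4": (830, 859),
    "A7": (1091, 1111),
    "A8": (1123, 1143),
    "A10": (1275, 1295),
}

def band_maximum(shifts, intensities, low, high):
    """Highest intensity among points whose Raman shift lies in [low, high]."""
    in_band = [i for s, i in zip(shifts, intensities) if low <= s <= high]
    if not in_band:
        raise ValueError(f"no data points between {low} and {high} cm^-1")
    return max(in_band)

def sample_features(shifts, spectra, bands=BANDS_CM1):
    """Average each band maximum over the spectra recorded for one sample."""
    return {
        name: fmean(band_maximum(shifts, spectrum, low, high) for spectrum in spectra)
        for name, (low, high) in bands.items()
    }

# Invented four-point spectra from three droplets of one sample.
toy_shifts = [740, 755, 765, 780]
toy_spectra = [
    [1.0, 5.0, 3.0, 9.0],
    [1.0, 4.0, 6.0, 9.0],
    [1.0, 7.0, 2.0, 9.0],
]
toy_features = sample_features(toy_shifts, toy_spectra, bands={"A3": BANDS_CM1["A3"]})
print(toy_features)
```

The resulting dictionary corresponds to one row of training data; in the study each row would contain 15 such values, one per range.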
Table 3 Assignment of serum sample
Figure 1 Schematic of the confocal micro-Raman spectrometer used in this study. A Nomadic Raman microscope with an excitation laser at a wavelength of 1064 nm was used in this study. A 20 × objective lens with a correction collar for near-infrared microscopy (LCPLN20XIR; Olympus Corporation, Tokyo, Japan) and a 2048 × 64 pixel thermoelectrically cooled indium gallium arsenide (InGaAs) charge-coupled device (CCD) detector with a spectral range of 100-3200 cm⁻¹ (grating 4 cm⁻¹) were used to record the spectra.
Figure 2 Raman spectra of serum samples from patients. The spectra of the serum samples from patients with rectal cancer, colon adenoma, and colon hyperplastic polyp, and the patient with no specific findings.
Figure 3 Raman spectra of patient serum samples and assignments. Raman spectrum of the serum sample from a patient with internal hemorrhoid (53-year-old female), and the selected ranges of Raman shift.
The major Raman shifts[17-19,21-37,43-45] that contributed to the prediction of CRC (effect > 0.1) were, in descending order of effect, A10 (1275-1295 cm⁻¹; amide III), A8 (1123-1143 cm⁻¹; C-N and skeletal C-C), and A3 (751-771 cm⁻¹; DNA, pyrimidines [cytosine, thymine, uracil], and tryptophan). The major Raman shifts that contributed to the prediction of colorectal adenoma (effect > 0.1) were, in descending order, A4 (830-859 cm⁻¹; tyrosine and pro C-C) and A10 (1275-1295 cm⁻¹; amide III). The major Raman shifts that contributed to the prediction of colorectal polyps (effect > 0.1) were, in descending order, A7 (1091-1111 cm⁻¹; PO₂ stretching and skeletal C-C), A1 (611-631 cm⁻¹; phenylalanine), and A4 (830-859 cm⁻¹; tyrosine and pro C-C). A10 (1275-1295 cm⁻¹; amide III) affected the prediction of both cancer and adenomas. A4 (830-859 cm⁻¹; tyrosine and pro C-C) influenced the prediction of both adenomas and polyps. The only Raman shift that influenced the prediction of rectal neuroendocrine tumors was A11, which differed from those for the prediction of cancer, adenomas, and polyps (Table 4, Supplementary Tables 3-6). We investigated the maximum scattered light intensities of the Raman shifts A10, A8, and A3 that contributed to the prediction of CRC. Compared with the groups with adenomas, hyperplastic polyps, and no specific findings and/or other diseases, the mean scattered light intensity of A8 tended to be higher and the mean scattered light intensities of A3 and A10 tended to be lower in the cancer samples (Figure 4). However, the differences were not significant according to the Steel-Dwass test (Supplementary Table 7).
Table 4 Boosted tree model for the prediction of colorectal disease
Numerous studies have been performed to diagnose cancer using blood tests. Tumor markers, free DNA[9], miRNA[10], and circulating cancer cells[11] have been the main targets of blood-based CRC diagnostic techniques. To date, however, high-precision technologies have not been developed, and standard blood-based procedures for cancer diagnosis have not yet been established. Raman spectroscopy is an analytical method that can quickly evaluate the components of unlabeled samples that have not been pre-treated[46]. However, the measurement of Raman spectra is strongly inhibited by autofluorescence[47], and thus it is difficult to analyze a biological sample with high accuracy. Furthermore, because the detection sensitivity of label-free, spontaneous Raman spectroscopy is lower than that of labeling techniques, it is difficult to detect minute quantities of target substances. Correspondingly, it is necessary to enhance the scattered light using various methods, including surface-enhanced Raman spectroscopy. In fact, most Raman spectroscopic analyses of blood performed to date have utilized surface-enhanced Raman spectroscopy[19,20,48-50]. Several studies with small sample sizes have investigated blood-based diagnosis of CRC with Raman spectroscopy. Almost all of these studies were based on the surface-enhanced Raman scattering technique[14,51-54]. Principal component analysis (PCA)[51], partial least squares[52,53], linear discriminant analysis (LDA)[52], and PCA-LDA[14,54] were used for evaluation, and diagnostic sensitivities and specificities of 86.4%-100% and 80%-100%, respectively, were reported[14,51-54]. Although the surface-enhanced Raman scattering technique is highly sensitive, its target substances are limited by the designed specifications of the colloids, which are based on precious metals such as gold and silver.
Therefore, important unknown substances that could be used for the detection of cancer may escape detection when surface-enhanced Raman scattering techniques are used. In fact, the molecular bonds in serum reported to be significant for cancer detection have varied among studies[14,51-54]. Furthermore, the Raman scattered light enhancement technique lacks the advantages of ordinary Raman spectroscopy: it is not label-free, it is more complex, and its response is slower. It would therefore be ideal if a diagnosis could be established with high accuracy using ordinary, spontaneous Raman spectroscopy. Our label-free, spontaneous Raman spectroscopy system could detect known and unknown substances in a comprehensive manner. In this study, a highly accurate CRC prediction model with a generalized R² value that exceeded 0.99 was constructed. We used a near-infrared laser, which is hardly affected by autofluorescence, as the excitation light source. Furthermore, a microtube was developed for serum measurements. Therefore, the Raman spectra of the serum could be obtained without a surface-enhanced Raman technique.
Figure 4 Comparison of intensity. Outlier box plot depicting the intensity of the scattered light of the sera from the studied patients. The bottom and top of each box show the lower and upper quartiles, and the band across the box indicates the median. The bars at the ends of the whiskers show the lowest data point within 1.5 interquartile ranges of the lower quartile and the highest data point within 1.5 interquartile ranges of the upper quartile. The dots denote outliers that extend beyond the whiskers. The diagonal square indicates the mean value. Although the difference was not significant, the mean scattered light intensity of A8 was higher in the cancer group (235.6) than in the adenoma (229.4), hyperplastic polyp (229.7), and other disease and/or no specific findings (231.0) groups. Likewise, although the difference was not significant, the mean scattered light intensity of A10 was lower in the cancer group (263.4) than in the adenoma (272.4), hyperplastic polyp (272.1), and other disease and/or no specific findings (271.6) groups. The mean scattered light intensity of A3 tended to be slightly lower in the cancer group (251.3) than in the adenoma (255.2), hyperplastic polyp (254.9), and other disease and/or no specific findings (253.9) groups.
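The box plot elements described in this caption can be computed as in the sketch below. The quartile convention is an assumption (Python's "inclusive" interpolation); statistical packages differ slightly in how they interpolate quartiles, so the values need not match those of the original figure. The function name and the toy data are invented.

```python
from statistics import fmean, quantiles

def box_plot_summary(values):
    """Quartiles, whisker ends, outliers, and mean for an outlier box plot."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
        "mean": fmean(data),
    }

# Invented data with one obvious outlier.
toy_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
toy_summary = box_plot_summary(toy_values)
print(toy_summary)
```

The example shows why the mean marker can sit far from the median when outliers are present: the single extreme value shifts the mean but not the quartiles.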
In this study, our system could accurately predict the presence of CRC, adenomas, and hyperplastic polyps. Although our system could also predict the presence of rectal neuroendocrine tumors, only two patients with such tumors were included. Accordingly, the findings should be confirmed in future studies. Our less invasive blood-based test could potentially be used for CRC screening. In addition, this technique can comprehensively detect known and unknown molecules contained in a sample. Therefore, by constructing an optimal algorithm based on the Raman spectrum, it may be applicable to the diagnosis of various diseases, including congenital, genetic, and metabolic diseases.
The Raman shifts (effect > 0.1) in our CRC prediction model were in the ranges of 1275-1295 cm⁻¹ (amide III), 1123-1143 cm⁻¹ (C-N and skeletal C-C), and 751-771 cm⁻¹ [DNA, pyrimidines (cytosine, thymine, uracil), and tryptophan] (Table 3). In the sera obtained from CRC patients, the scattered light intensities in these three ranges were low, high, and low, respectively. In contrast to our results, Feng et al[52] reported the exact opposite with a surface-enhanced Raman scattering technique. Hong et al[51] also reported that the scattered light of the serum from CRC patients had a relatively high intensity at 1275-1295 cm⁻¹ (amide III). Other studies reported no difference regarding these Raman shifts[14,53,54]. These discrepancies may be attributed to differences between spontaneous and surface-enhanced Raman spectra or to inaccuracies in the analyzed results owing to the small data sizes. In this study, only the disease was used as an index, and the effects of age, sex, presence or absence of comorbidity, total protein concentration, albumin concentration, and other biological factors were not considered. We need to carefully analyze more samples after matching these factors to enhance our CRC prediction model.
The Raman shifts that strongly affected the prediction of colorectal adenomas and hyperplastic polyps partially overlapped with those used for the prediction of CRC. For the prediction of colorectal adenomas, effects larger than 0.1 were found in the ranges of 830-859 cm⁻¹ (tyrosine and pro C-C) and 1275-1295 cm⁻¹ (amide III). Neither our study nor previous studies have shown a clear link between colorectal adenomas and serum tyrosine levels. The “adenoma-carcinoma sequence”[55] has been proposed, and abnormalities in Wnt signaling in colorectal adenomas have been suggested to be associated with the development of CRC[56,57]. In this study, the amide III intensity level was an influential factor for both CRC and adenomas. Furthermore, the amide III intensity level was low in cancer and high in adenomas. Thus, serum amide III levels may be useful for assessing the risk of CRC. For the prediction of hyperplastic colon polyps, effects were larger in the ranges of 1091-1111 cm⁻¹ (PO₂ stretching and skeletal C-C), 611-631 cm⁻¹ (phenylalanine), and 830-859 cm⁻¹ (tyrosine and pro C-C). The Raman shift at 830-859 cm⁻¹ (tyrosine and pro C-C) was also relevant to the prediction of colorectal adenomas. This overlap may be associated with the transition of hyperplastic colorectal polyps to colorectal adenomas and may be predictive of the development of colorectal adenomas[58]. In this study, all polyps were diagnosed endoscopically; therefore, the possibility that some were adenomas cannot be ruled out.
The limitations of this study were as follows. First, the number of subjects was small; the total number of subjects was 184, of which only 12 were CRC patients. Additionally, since there were only two samples from patients with neuroendocrine tumors, we could not give definitive results for neuroendocrine tumors in this study. In the future, verification with more subjects must be performed. Second, this study did not consider the effects of biological factors other than disease, such as age, sex, presence or absence of comorbidities, and total protein or albumin concentrations. We need to revalidate the technology with prospective trials tailored to the subjects' backgrounds. Third, the patients analyzed in this study did not all undergo the same clinical examinations, so the accuracy of the clinical diagnoses may have varied. They may also have had malignancies of organs other than the colon. Fourth, it was not clear whether our technology detected high-risk states of carcinogenesis or the cancer cells themselves after the onset of carcinogenesis. If the moieties detected by our technology are derived from cancer cells, this technology could be used to assess the effects of cancer therapy.
In summary, we presented a model for diagnosing CRC from serum using machine learning. However, the clinical usefulness of this model remains to be determined. We plan to continue our research to improve the accuracy of our results. First, we will include larger cohorts. Second, we will increase the number of target cancer types. Third, blood will be collected both before and after treatment, and the Raman spectra of the serum will be compared. From these results, we believe that the significance of our label-free, spontaneous Raman spectroscopy technology in cancer treatment will become clearer. Regarding the Raman spectral analysis, a comprehensive machine learning method capable of analyzing the entire spectrum will be designed and implemented.
We studied a minimally invasive and accurate diagnostic method for CRC, which is the second most common cause of cancer-related death worldwide. We found that machine learning analysis of the Raman spectrum of serum could be an excellent diagnostic method for CRC. Since Raman spectroscopy is greatly affected by autofluorescence, it is technically difficult to analyze biological samples by Raman spectroscopy. We succeeded in recording the Raman spectrum of untreated serum with high accuracy by using a near-infrared laser, which is less affected by autofluorescence, as the excitation light source. Then, using the recorded Raman spectra as data, we constructed a CRC prediction model with the "Boosted Tree Model", which is a machine learning method. Although this model may predict CRC with high accuracy, we should analyze more clinical data to confirm its clinical usefulness.
Colorectal cancer (CRC) is an important disease worldwide. Among all cancers, CRC accounts for the second highest number of deaths and the third highest number of new cases. Blood testing is simple and minimally invasive. However, no blood test can presently diagnose CRC accurately.
We have tried to develop a comprehensive, spontaneous, minimally invasive, label-free, blood-based CRC screening technique based on Raman spectroscopy.
We used Raman spectra recorded from 184 serum samples obtained from patients (CRC in 12 patients, rectal neuroendocrine tumor in 2 patients, colorectal adenoma in 68 patients, colorectal hyperplastic polyp in 18 patients, and others in 84 patients) undergoing colonoscopy.
Raman spectra were recorded from 184 serum samples using a 1064-nm laser for excitation.
Using the recorded Raman spectra as training data, we constructed boosted tree prediction models based on machine learning. The generalized R² values for CRC, adenomas, hyperplastic polyps, and neuroendocrine tumors were 0.9982, 0.9630, 0.9962, and 0.9986, respectively.
Using machine learning on Raman spectral data, we obtained a diagnostic model that predicted CRC with a high R² value.
We are currently planning studies to demonstrate the clinical usefulness of this model with a vast volume of additional data.
We are grateful to the patients who donated blood. We are also grateful to the clinical staff at Showa University, Koto Toyosu Hospital. We would like to thank Ms. Iono A and Mr. Oi H (SAS Institute Inc.) for their dedicated support in machine learning.
World Journal of Gastrointestinal Oncology, 2020, Issue 11