== Abstract ==
Multiple myeloma (MM) is a clonal plasma cell cancer characterized by excessive division of plasma cells in the bone marrow, which can crowd out healthy cells. The result is end-organ damage to the kidneys, bones, and liver. Worldwide, MM accounted for 160,000 cases in 2018, and 106,000 patients succumbed to the disease. MM is diagnosed relatively well by detecting the monoclonal (M) protein produced by cancerous cells, yet mortality remains high because no specific treatment exists. By identifying genes that are differentially expressed in malignant plasma cells, scientists can develop new and more effective therapies targeted at potential driver genes. This study takes a novel machine learning approach to identifying driver genes of MM. Single-cell RNA sequencing data were obtained from the Gene Expression Omnibus database, covering 26 patients in various disease stages and 9 healthy donors, for a total of 29,367 plasma cells and 22,088 genes. Three machine learning models were evaluated: Random Forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN). RF achieved the highest accuracy, correctly assigning a cell to its stage of MM 95.61% of the time. The principal components identified ANKRD28, CXCR4, and HLA-DPA1, among several other potential driver genes that are corroborated by previous literature. Notably, the models also identified RP5-1171I10.5, a gene not yet established to be associated with multiple myeloma. These genes show potential for further study as targets for specific genetic therapy.
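As a rough illustration of the classification task described in the abstract (assigning a single cell to a disease stage from its expression profile and scoring the result by accuracy), the following is a minimal k-nearest-neighbors sketch in plain Python. It is not the study's pipeline: the data are synthetic, and the stage labels (`healthy`, `stage_A`, `stage_B`), the three-gene vectors, and all function names are hypothetical choices made only for this example.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Label cell x by majority vote among its k nearest training cells."""
    # Euclidean distance between expression vectors.
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of held-out cells whose predicted label matches the true one."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)


def make_toy_cells(n_per_stage, seed=0):
    """Generate synthetic 3-gene expression vectors (NOT real data)."""
    rng = random.Random(seed)
    # Hypothetical, well-separated mean expression levels per label.
    centers = {
        "healthy": (1.0, 1.0, 1.0),
        "stage_A": (4.0, 1.0, 1.0),
        "stage_B": (1.0, 4.0, 1.0),
    }
    X, y = [], []
    for label, center in centers.items():
        for _ in range(n_per_stage):
            X.append(tuple(rng.gauss(mu, 0.3) for mu in center))
            y.append(label)
    return X, y


if __name__ == "__main__":
    train_X, train_y = make_toy_cells(30, seed=0)
    test_X, test_y = make_toy_cells(10, seed=1)
    print("held-out accuracy:", accuracy(train_X, train_y, test_X, test_y))
```

On real single-cell data the feature space would span thousands of genes, so dimensionality reduction and a library implementation of RF, SVM, and KNN would normally replace this hand-written classifier.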
== Full document ==
<pdf>Media:Draft_Kim_152773742-2732-document.pdf</pdf>
<pdf>Media:Draft_Kim_152773742-4472-document.pdf</pdf>

Revision as of 00:56, 29 June 2023



Document information

Published on 29/07/23
Submitted on 28/06/23

Volume 5, 2023
Licence: CC BY-NC-SA
