Research Article| Volume 10, 100484, 2023

Optimizing MRI-based brain tumor classification and detection using AI: A comparative analysis of neural networks, transfer learning, data augmentation, and the cross-transformer network

Open Access. Published: March 14, 2023. DOI: https://doi.org/10.1016/j.ejro.2023.100484

      Abstract

      Early detection and diagnosis of brain tumors are crucial for taking adequate preventive measures, as with most cancers. Meanwhile, artificial intelligence (AI) has grown exponentially, even in environments as complex as medicine. Here we propose a framework to explore state-of-the-art deep learning architectures for brain tumor classification and detection. We also include an architecture of our own, called Cross-Transformer, which consists of three scalar products that combine the keys, queries, and values of self-attention models. Initially, we focused on the classification of three types of tumors: glioma, meningioma, and pituitary. The InceptionResNetV2, InceptionV3, DenseNet121, Xception, ResNet50V2, VGG19, and EfficientNetB7 networks were trained on the Figshare brain tumor dataset. Classification accuracy reached over 97 % in this experiment, which provided an overview of the networks' performance. Subsequently, we focused on tumor detection using the Brain MRI Images for Brain Tumor Detection dataset and The Cancer Genome Atlas Low-Grade Glioma database. The study covers transfer learning, data augmentation, and three image acquisition sequences: T1-weighted images (T1WI), T1-weighted post-gadolinium (T1-Gd), and Fluid-Attenuated Inversion Recovery (FLAIR). Based on the results, transfer learning and data augmentation increased accuracy by up to 6 %, with a p-value below the significance level of 0.05. In addition, the FLAIR sequence was the most effective for detection. Finally, our proposed model was the most efficient in terms of training time, requiring approximately half the time of the second-fastest network.

      1. Introduction

      Cancer is one of the most common diseases worldwide, with an estimated 1.8 million new cases and more than 600,000 deaths in 2020 in the United States alone [
      American Cancer Society, Cancer Facts & Figures 2020, 1–76, 2020.
      ,
      Cancer, https://www.who.int/en/news-room/fact-sheets/detail/cancer (accessed November 17, 2020).
      ]. Cancer is a disease characterized by the uncontrolled growth of abnormal cells in the body. It is caused by mutations or changes in the function of cells [
      • Mack T.M.
      What a cancer is.
      ], which leads to the loss of the cell's ability to undergo programmed cell death [
      • Ray S.D.
      • Yang N.
      • Pandey S.
      • Bello N.T.
      • Gray Y.J.P.
      Apoptosis.
      ]. This results in the formation of tumors and affects various organs and tissues [
      • Foster J.R.
      Introduction to Neoplasia.
      ,
      • Yokota J.
      Tumor progression and metastasis.
      ]. Depending on the affected organ, cancer can be difficult to detect or can complicate treatment [
      • Ost D.E.
      • Gould Y.M.K.
      Decision making in patients with pulmonary nodules.
      ,
      • Auvinen A.
      • Hakama Y.M.
      Cancer screening: theory and applications.
      ]. For example, brain cancer involves parts of the central nervous system (CNS), making it difficult to perform surgery or radiotherapy to remove the affected regions [
      • Huang R.
      • Boltze J.
      • Li Y.S.
      Strategies for improved intra-arterial treatments targeting brain tumors: a systematic review.
      ].
      Brain tumors, while they rarely spread to other parts of the body, can still be dangerous as they can grow quickly and damage brain tissue as they diffuse to nearby areas. The growth can press on brain tissue, causing high-impact complications even if the tumors are benign [
      • Moon S.-J.
      • Ginat D.T.
      • Tubbs R.S.
      • Moisi Y.M.D.
      Tumors of the brain.
      ,
      • Sontheimer H.
      Brain tumors.
      ]. Brain tumors account for approximately 2.17 % of all cancer deaths and the five-year survival rate is low, at around 5.6 % for glioblastoma [
      • Reynoso-Noverón N.
      • Mohar-Betancourt A.
      • Ortiz-Rafael Y.J.
      Epidemiology of brain tumor.
      ]. The impact of brain tumors and the concerning statistics have motivated ongoing research in the field [
      • Turner N.
      • Vidovic Y.N.
      Cancer health concerns.
      ], with physicians and scientists searching for ways to prevent tumors, more efficient treatments, better diagnostic tests, and better ways to study and classify tumors [
      • Troyanskaya O.
      • Trajanoski Z.
      • Carpenter A.
      • Thrun S.
      • Razavian N.
      • Oliver Y.N.
      Artificial intelligence and cancer.
      ,
      • Bi W.L.
      • et al.
      Artificial intelligence in cancer imaging: Clinical challenges and applications.
      ]. This research includes new methods for exploring brain anatomy and the development of AI systems [
      • Hosny A.
      • Parmar C.
      • Quackenbush J.
      • Schwartz L.H.
      • Aerts Y.H.J.W.L.
      Artificial intelligence in radiology.
      ].
      Several tools can be used to detect brain abnormalities; computed tomography (CT), positron emission tomography (PET), magnetoencephalography (MEG), and magnetic resonance imaging (MRI) are among the most used [
      • Sharif M.I.
      • Li J.P.
      • Naz J.
      • Rashid Y.I.
      A comprehensive review on multi-organs tumor detection based on machine learning.
      ,
      • Bhatele K.R.
      • Bhadauria Y.S.S.
      Brain structural disorders detection and classification approaches: a review.
      ]. MRI is considered the most popular and effective method for detecting brain abnormalities because it can distinguish between different structures and tissues and it does not use ionizing radiation, making it safe for patients [
      • Pauli R.
      • Wilson Y.M.
      The basic principles of magnetic resonance imaging.
      ]. AI has been applied in the field of brain tumor detection, classification, segmentation, diagnosis, and evolution [
      • Bi W.L.
      • et al.
      Artificial intelligence in cancer imaging: Clinical challenges and applications.
      ,
      • Duong M.T.
      • Rauschecker A.M.
      • Mohan Y.S.
      Diverse applications of artificial intelligence in neuroradiology.
      ,
      • Nazir M.
      • Shakil S.
      • Khurshid Y.K.
      Role of deep learning in brain tumor detection and classification (2015 to 2020): A review.
      ,
      • Işın A.
      • Direkoğlu C.
      • Şah Y.M.
      Review of MRI-based brain tumor image segmentation using deep learning methods.
      ]. The application of AI, especially deep learning (DL)-based methods, has demonstrated accuracy comparable to that of expert radiologists [
      • Aggarwal R.
      • et al.
      Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis.
      ,
      • Serte S.
      • Serener A.
      • Al‐Turjman Y.F.
      Deep learning in medical imaging: a brief review.
      ,
      • He K.
      • Zhang X.
      • Ren S.
      • Sun Y.J.
      Delving deep into rectifiers: surpassing human-level performance on imagenet classification.
      ].
      Advances in AI have produced many models and architectures for a variety of tasks, but mostly on natural images [
      • Schmidhuber J.
      Deep learning in neural networks: an overview.
      ,
      • Ravi D.
      • et al.
      Deep learning for health informatics.
      ,
      • Liu W.
      • Wang Z.
      • Liu X.
      • Zeng N.
      • Liu Y.
      • Alsaadi Y.F.E.
      A survey of deep neural network architectures and their applications.
      ,
      • Dong S.
      • Wang P.
      • Abbas Y.K.
      A survey on deep learning and its applications.
      ]. However, these results cannot be fully extrapolated to medical imaging because of the distinct physical and physiological characteristics that medical images record [
      • Zhang X.
      • Smith N.
      • Webb Y.A.
      Medical imaging.
      ]. In line with that, machine learning approaches have been developed for the detection of neoplasms, such as basic algorithms like k-nearest neighbors (kNN), with an accuracy of 98.2 % [
      • Deepa G.
      • Mary G.L.R.
      • Karthikeyan A.
      • Rajalakshmi P.
      • Hemavathi K.
      • Dharanisri Y.M.
      Detection of brain tumor using modified particle swarm optimization (MPSO) segmentation via haralick features extraction and subsequent classification by KNN algorithm.
      ], or the use of principal component analysis (PCA) with a sensitivity, specificity, and accuracy of 97.36 %, 100 %, and 95.0 % [
      • Islam M.K.
      • Ali M.S.
      • Miah M.S.
      • Rahman M.M.
      • Alam M.S.
      • Hossain Y.M.A.
      Brain tumor detection in MR image using superpixels, principal component analysis and template based K-means clustering algorithm.
      ]. Another example is the use of a support vector machine (SVM) to differentiate between benign and malignant tumors, with an accuracy of 99.24 %, precision of 95.83 %, and recall of 95.30 % [
      • Bhagat N.
      • Kaur Y.G.
      MRI brain tumor image classification with support vector machine.
      ]. Lastly, an ensemble method comprising the Bagging Classifier, Random Forest, Extra Trees Classifier, Gradient Boosting, Extreme Gradient Boosting, and Adaptive Boosting algorithms has been used to achieve an accuracy of 94.07, precision of 90.78, recall of 93.33, specificity of 94.44, and F1 score of 91.52 [
      • Chandra Joshi R.
      • Mishra R.
      • Gandhi P.
      • Pathak V.K.
      • Burget R.
      • Dutta Y.M.K.
      Ensemble based machine learning approach for prediction of glioma and multi-grade classification.
      ].
      In the field of Deep Learning, advancements have been made in detecting physiological anomalies using Deep Belief Networks (DBN) with an accuracy of over 94.11 % [
      • Sathies Kumar T.
      • Arun C.
      • Ezhumalai Y.P.
      An approach for brain tumor detection using optimal feature selection and optimized deep belief network.
      ]. Other developments include the detection of brain metastasis using single-shot detection models on CT scans with a sensitivity of 88.7 % [
      • Takao H.
      • Amemiya S.
      • Kato S.
      • Yamashita H.
      • Sakamoto N.
      • Abe Y.O.
      Deep-learning single-shot detector for automatic detection of brain metastases with the combined use of contrast-enhanced and non-enhanced computed tomography images.
      ] and the use of Wasserstein generative adversarial networks (WGAN) for cancer diagnosis [
      • Xiao Y.
      • Wu J.
      • Lin Y.Z.
      Cancer diagnosis using generative adversarial networks based on deep learning from imbalanced data.
      ]. Additionally, feature-based artificial neural networks (ANN) and the Extended Set-Membership Filter (ESMF) have been used to diagnose brain tumors with an accuracy of 97.14 % and 88.24 % respectively. Future research should focus on classifying abnormalities into benign and malignant tumors [
      • Song G.
      • Shan T.
      • Bao M.
      • Liu Y.
      • Zhao Y.
      • Chen Y.B.
      Automatic brain tumour diagnostic method based on a back propagation neural network and an extended set-membership filter.
      ]. Along these lines, some studies use more robust DL-oriented approaches, such as a convolutional network with a new regularization method called Mixed-Pooling-Dropout, which yields a classification accuracy of 92.6 % compared to 86.8 % for traditional clustering methods [
      • Ait Skourt B.
      • El Hassani A.
      • Majda Y.A.
      Mixed-pooling-dropout for convolutional neural network regularization.
      ]. Another study employed DarkNet for brain tumor classification and segmentation, incorporating data augmentation and transfer learning with a Figshare database of 708 meningiomas, 1426 gliomas, and 930 pituitary tumors, achieving an accuracy of 98.54 % [
      • Ahuja S.
      • Panigrahi B.K.
      • Gandhi Y.T.K.
      Enhanced performance of Dark-Nets for brain tumor classification and segmentation using colormap-based superpixel techniques.
      ].
      Tandel et al. used a combination of five DL networks (AlexNet, VGG16, ResNet18, GoogleNet, and ResNet50) and five conventional machine learning algorithms (Support Vector Machine, K-Neighbors, Naive Bayes, Decision Tree, and Linear Discrimination) to classify by majority vote using a five-fold cross-validation scheme. They also used data augmentation methods such as scaling and rotation and incorporated transfer learning. They achieved average scores over 97.10 % in accuracy, sensitivity, specificity, area under the receiver operating characteristic curve, positive predictive value, and negative predictive value [
      • Tandel G.S.
      • Tiwari A.
      • Kakde Y.O.G.
      Performance optimisation of deep learning models using majority voting algorithm for brain tumour classification.
      ].
      The results of using deep learning networks for the classification and detection of brain tumors using magnetic resonance images are promising and demonstrate the high effectiveness of these networks. However, the potential of these strategies has only recently been explored and many configurations may affect the performance of AI [
      • Waring J.
      • Lindvall C.
      • Umeton Y.R.
      Automated machine learning: review of the state-of-the-art and opportunities for healthcare.
      ]. This work makes the following contributions:
      • We developed a new architecture based on attention models, like the Transformer network, which we call Cross-Transformer.
      • An overview of artificial intelligence systems in detection and classification was performed.
      • Seven novel deep-learning networks were compared for brain tumor classification.
      • Seven novel deep-learning networks were compared for brain tumor detection. Additionally, an influence assessment of data augmentation and learning transfer has been carried out.
      • The detection experiment was repeated for all seven networks in a comparative analysis of the three most common acquisition sequences, and the novel architecture we named Cross-Transformer was included alongside the seven networks.

      2. Materials and methods

      2.1 Dataset

      As previously mentioned, Magnetic Resonance Imaging (MRI) is a widely accepted and reliable method for identifying brain abnormalities due to its ability to differentiate between various structures and tissues within the brain. Therefore, this paper addresses only three principal sequences: T1WI, T1-Gd, and FLAIR. Images were collected from multiple centers and institutions to ensure that the MRI data used is diverse. The datasets used for this study are described in detail in Table 1 and Fig. 1, which also show examples of various images obtained from the three datasets: The Brain Tumor Dataset (BTD), Magnetic Resonance Imaging Dataset (MRI-D), and The Cancer Genome Atlas Low-Grade Glioma database (TCGA-LGG). For details regarding the pre-processing techniques implemented on the images, please refer to the supplementary material.
      Table 1. Dataset used for training convolutional neural networks for brain tumor detection.
      Dataset | Subjects | Sequences | Slices | Classes | Images per class
      BTD | 233 | T1-Gd | Axial, coronal and sagittal | Meningioma | 708
      BTD | | | | Glioma | 1426
      BTD | | | | Pituitary | 930
      MRI-D | 253 | T1WI | Axial | Tumors | 155
      MRI-D | | | | Not tumors | 98
      TCGA-LGG | 110 | T1WI, T1-Gd, FLAIR | Axial | Tumors | 1373
      TCGA-LGG | | | | Not tumors | 2556
      Fig. 1. Samples of A) the three types of brain tumors in the BTD database, B) the MRI-D database with the two classes: tumors and non-tumors, and C) the TCGA-LGG database with the three types of sequences and in the two classes: tumors and non-tumors.

      2.2 Performance evaluation metrics

      The performance of the networks in detecting or classifying brain tumors was evaluated by computing the F1 score, accuracy, sensitivity, specificity, and precision. These metrics are defined by Eqs. (1) to (5) in Table 2.
      Table 2. Performance evaluation metrics.
      Metric | Equation
      Accuracy [Šimundić A.-M., Measures of diagnostic accuracy: basic definitions.] | $ACC = \frac{TP + TN}{TP + TN + FP + FN}$ (1)
      F1 score [D.M.W. Powers, Evaluation: from Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation, Oct. 2020.] | $F_1 = \frac{2TP}{2TP + FP + FN}$ (2)
      Sensitivity or Recall [D.M.W. Powers, Evaluation: from Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation, Oct. 2020.; Trevethan R., Sensitivity, specificity, and predictive values: foundations, pliabilities, and pitfalls in research and practice.] | $SE = \frac{TP}{TP + FN}$ (3)
      Specificity [Šimundić A.-M., Measures of diagnostic accuracy: basic definitions.; Trevethan R., Sensitivity, specificity, and predictive values: foundations, pliabilities, and pitfalls in research and practice.] | $SP = \frac{TN}{TN + FP}$ (4)
      Precision [D.M.W. Powers, Evaluation: from Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation, Oct. 2020.] | $Pr = \frac{TP}{TP + FP}$ (5)
      Where, the terms TP, TN, FP, and FN are the true positives, true negatives, false positives, and false negatives, respectively.
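      As an illustration only, the five metrics of Eqs. (1) to (5) can be computed from the four confusion counts in a few lines of Python. This is a minimal sketch, not the authors' code: the names `ConfusionCounts` and `evaluate_metrics` and the example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """True/false positives and negatives for one binary decision."""
    tp: int
    tn: int
    fp: int
    fn: int


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a metric is undefined (empty denominator).
    return numerator / denominator if denominator else float("nan")


def evaluate_metrics(c):
    """Compute Eqs. (1) to (5) from the confusion counts."""
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.tn + c.fp + c.fn),  # Eq. (1)
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),              # Eq. (2)
        "sensitivity": _ratio(c.tp, c.tp + c.fn),                    # Eq. (3)
        "specificity": _ratio(c.tn, c.tn + c.fp),                    # Eq. (4)
        "precision": _ratio(c.tp, c.tp + c.fp),                      # Eq. (5)
    }


# Hypothetical counts, chosen only to exercise the formulas.
example = evaluate_metrics(ConfusionCounts(tp=30, tn=15, fp=4, fn=2))
for name, value in example.items():
    print(f"{name}: {100 * value:.2f} %")
```

      Returning NaN for an empty denominator is a design choice of this sketch; it keeps a degenerate fold (for example, one with no positive samples) from aborting an evaluation loop.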

      2.3 Experimental design

      The initial data sets were divided into two sets with 80% and 20% proportions for training and testing, respectively. For an understanding of the neural networks utilized in this study and their key attributes, please refer to the supplementary material. All networks were trained using the following hyperparameters:
      • Loss function: Categorical cross-entropy.
      • Optimizer: Adadelta.
      • Epochs: 50
      • Validation: 10-fold cross-validation.
      • Number of repeated runs per fold: 3
      • Batch size: 4
      • Initialization of weights: Glorot uniform.
      • Bias initialization: Zeros.
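      The splitting scheme and settings listed above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions: the text does not say how the 80/20 hold-out split and the 10-fold cross-validation interact, so the folds are assumed here to be drawn from the training portion; the shuffling seed, the dictionary keys, and the function names are hypothetical.

```python
import random

# Settings copied from the list above. Keys are illustrative names;
# string values follow Keras-style identifiers.
HYPERPARAMS = {
    "loss": "categorical_crossentropy",
    "optimizer": "Adadelta",
    "epochs": 50,
    "k_folds": 10,
    "runs_per_fold": 3,
    "batch_size": 4,
    "kernel_initializer": "glorot_uniform",
    "bias_initializer": "zeros",
}


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_splits(indices, k):
    """Yield (train, validation) index lists for each of the k folds."""
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        train = [idx for f, fold in enumerate(folds) if f != held_out for idx in fold]
        yield train, folds[held_out]


# 3064 is the BTD image count from Table 1 (708 + 1426 + 930).
train_idx, test_idx = holdout_split(3064)
splits = list(k_fold_splits(train_idx, HYPERPARAMS["k_folds"]))

# One training job per (fold, repeated run) pair.
schedule = [
    (fold, run)
    for fold in range(HYPERPARAMS["k_folds"])
    for run in range(HYPERPARAMS["runs_per_fold"])
]
print(len(train_idx), len(test_idx), len(schedule))
```

      Under these assumptions, each network is trained 30 times (10 folds times 3 repeated runs), which is what gives the score distributions compared later with the Kruskal-Wallis test.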
      Three experiments were conducted to classify and detect brain tumors using MRI data and the seven CNNs mentioned above. In the first experiment, the BTD dataset (see Table 1) was used to classify tumors into three types: meningioma, glioma, and pituitary. The second experiment focused on detecting brain tumors, i.e., determining the presence or absence of a tumor in an MRI. In this experiment, transfer learning and data augmentation were applied because the amount of data was low. Four conditions were used: no method, transfer learning, data augmentation, and transfer learning combined with data augmentation. This experiment used the MRI-D database (see Table 1). Finally, the detection experiment was repeated on the TCGA-LGG database using three acquisition sequences (T1WI, FLAIR, and T1-Gd), and the proposed new network, called Cross-Transformer (see supplementary material), was included. The performance of each training was evaluated using metrics such as accuracy, sensitivity, specificity, and F1 score. The results were compared using the nonparametric Kruskal-Wallis test. The architectures were modeled using Python and libraries such as Keras and TensorFlow. The experiments were run on the Colab platform using a Tesla T4 GPU and 25 GB of RAM.

      3. Results

      3.1 Tumor classification – BTD dataset

      The deep learning models used for brain tumor classification demonstrated strong performance, achieving F1 scores above 93 % for five of the seven networks, as presented in Table 3. F1 scores ranged from 68.50 % to 95.39 %, with VGG19 and EfficientNetB7 clearly below the rest. InceptionResNetV2 was the best-performing model in terms of F1 score and accuracy, while EfficientNetB7 had the highest precision.
      Table 3. Maximum scores (%) achieved by the seven DL neural networks on the test data (BTD dataset).
      Network | F1 score | Accuracy | Sensitivity | Specificity | Precision
      InceptionResNetV2 | 95.39 | 97.22 | 96.17 | 98.15 | 97.67
      InceptionV3 | 94.85 | 96.89 | 97.81 | 97.43 | 94.80
      DenseNet121 | 94.82 | 96.89 | 96.72 | 97.66 | 96.55
      Xception | 94.59 | 96.73 | 96.72 | 97.87 | 94.14
      ResNet50V2 | 93.11 | 95.91 | 96.17 | 98.72 | 93.89
      VGG19 | 74.04 | 76.76 | 82.17 | 100.00 | 89.29
      EfficientNetB7 | 68.50 | 76.92 | 96.15 | 100.00 | 100.00
      Based on the results presented in Table 4, the deep learning models used for brain tumor classification demonstrated high effectiveness in identifying all three types of tumors with high accuracy, specificity, and precision values close to 100 %. Pituitary tumors achieved the highest scores, with an accuracy approximately 3 % higher than gliomas and 5 % more accurate than meningiomas.
      Table 4. Maximum scores (%) achieved by the seven DL neural networks as a function of the three classes.
      Class | F1 score | Accuracy | Sensitivity | Specificity | Precision
      Pituitary | 95.39 | 97.22 | 97.81 | 100.00 | 100.00
      Glioma | 93.59 | 93.94 | 96.15 | 100.00 | 97.67
      Meningioma | 82.71 | 92.14 | 85.92 | 100.00 | 100.00
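      The text does not state how the per-class scores were derived from the three-class predictions. A common choice, assumed here purely for illustration, is a one-vs-rest reduction of the multi-class confusion matrix, in which each class in turn is treated as "positive" and the other two as "negative". The matrix values and function names below are hypothetical.

```python
def one_vs_rest_counts(matrix, k):
    """Reduce a square confusion matrix (rows: true class, columns: predicted
    class) to the (TP, TN, FP, FN) counts of class k against all others."""
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                    # class k predicted as another class
    fp = sum(row[k] for row in matrix) - tp     # other classes predicted as class k
    total = sum(sum(row) for row in matrix)
    tn = total - tp - fn - fp
    return tp, tn, fp, fn


def sensitivity_specificity(matrix, k):
    """Per-class sensitivity and specificity under the one-vs-rest reduction."""
    tp, tn, fp, fn = one_vs_rest_counts(matrix, k)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical test-set confusion matrix (not taken from the paper).
classes = ["meningioma", "glioma", "pituitary"]
confusion = [
    [120, 15, 7],
    [10, 270, 5],
    [2, 3, 181],
]
for k, name in enumerate(classes):
    se, sp = sensitivity_specificity(confusion, k)
    print(f"{name}: sensitivity={100 * se:.2f} %, specificity={100 * sp:.2f} %")
```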
      As shown in Table 5, the Kruskal-Wallis test yields a p-value for each pair of networks based on their F1 scores. In most cases, the p-value is below the significance level (α = 0.05), meaning the networks' score distributions differ significantly. Notably, InceptionResNetV2 and DenseNet121 have a p-value of 0.77; this is consistent with the box-and-whisker plots in Fig. 2 for accuracy, sensitivity, and specificity, which show similar ranges for both networks, particularly in the interquartile ranges.
      Table 5. P-value evaluated between the seven different neural networks using the Kruskal-Wallis test. Classification of the three types of tumors.
      Network | No. | 1 | 2 | 3 | 4 | 5 | 6 | 7
      InceptionResNetV2 | 1 | 1.00 | 0.00 | 0.77 | 0.01 | 0.00 | 0.00 | 0.00
      InceptionV3 | 2 | 0.00 | 1.00 | 0.00 | 0.00 | 0.19 | 0.00 | 0.00
      DenseNet121 | 3 | 0.77 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
      Xception | 4 | 0.01 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00
      ResNet50V2 | 5 | 0.00 | 0.19 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00
      VGG19 | 6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.11
      EfficientNetB7 | 7 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.11 | 1.00
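      A pairwise p-value matrix of this kind can be produced by applying the Kruskal-Wallis test to every pair of score samples. The sketch below is a standard-library illustration, not the authors' implementation: it covers only the two-group case, where the H statistic follows a chi-square distribution with one degree of freedom and the p-value reduces to erfc(sqrt(H/2)). The function names and the example scores are hypothetical.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_pair(a, b):
    """Return (H, p) for two samples, with the usual tie correction."""
    pooled = list(a) + list(b)
    n = len(pooled)
    ranks = average_ranks(pooled)
    rank_sum_a = sum(ranks[: len(a)])
    rank_sum_b = sum(ranks[len(a):])
    h = 12 / (n * (n + 1)) * (rank_sum_a**2 / len(a) + rank_sum_b**2 / len(b)) - 3 * (n + 1)
    tie_sizes = {}
    for v in pooled:
        tie_sizes[v] = tie_sizes.get(v, 0) + 1
    correction = 1 - sum(t**3 - t for t in tie_sizes.values()) / (n**3 - n)
    if correction == 0:  # every value identical: no evidence of a difference
        return 0.0, 1.0
    h = max(h / correction, 0.0)  # guard against tiny negative rounding errors
    # Chi-square survival function with 1 degree of freedom.
    return h, math.erfc(math.sqrt(h / 2))


def pairwise_p_values(scores_by_network):
    """Map each unordered pair of networks to its Kruskal-Wallis p-value."""
    return {
        (a, b): kruskal_wallis_pair(scores_by_network[a], scores_by_network[b])[1]
        for a, b in combinations(scores_by_network, 2)
    }


# Hypothetical F1 scores from repeated runs (not the paper's data).
scores = {
    "net_a": [0.95, 0.94, 0.96, 0.93, 0.95],
    "net_b": [0.94, 0.95, 0.96, 0.95, 0.93],
    "net_c": [0.75, 0.74, 0.77, 0.73, 0.76],
}
for pair, p in pairwise_p_values(scores).items():
    print(pair, f"p = {p:.3f}")
```

      For more than two groups, or when a vetted implementation is preferred, `scipy.stats.kruskal` provides the same test.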
      Fig. 2. Score distribution generated by the different training evaluated with the test data. A) accuracy, B) sensitivity, and C) specificity. D) metrics as a function of tumor type for the best performing network (InceptionResNetV2).
      The performance of InceptionResNetV2 in tumor classification is evaluated in Fig. 2D using the accuracy, sensitivity, and specificity metrics. The plots show that all metrics are close to 90 %, except for sensitivity for meningioma, which is markedly lower than the accuracy and specificity metrics. Additionally, pituitary tumors exhibit more homogeneous distributions than glioma and meningioma.
      The training and validation results for InceptionResNetV2 are presented in Fig. 3, which includes the loss function and accuracy metric. The curves display the average of multiple runs with a 95 % error band. The validation and training curves follow similar patterns, indicating good generalization ability and little or no overfitting. In Fig. 3A, the accuracy values are around 0.95, consistent with the findings in Fig. 2A. Fig. 3B shows loss values close to 0.1. Notably, the training and validation curves intersect at epoch 35, suggesting that the model could achieve high performance with fewer epochs.
      Fig. 3. Average training and 95 % error bands for the best performing network (InceptionResNetV2). A) accuracy as a function of epochs and B) loss as a function of epochs with training and validation data.
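      Curves of this kind, an average over repeated runs with a 95 % band, can be computed per epoch as sketched below. The text does not specify how the band was obtained; this sketch assumes a normal-approximation confidence interval of the mean (mean ± 1.96 standard errors). The function name and the example values are hypothetical.

```python
import math
from statistics import mean, stdev


def mean_with_band(runs, z=1.96):
    """Given one curve per run (all the same length), return a list of
    (lower, mean, upper) tuples, one per epoch."""
    band = []
    for epoch_values in zip(*runs):
        center = mean(epoch_values)
        if len(epoch_values) > 1:
            half_width = z * stdev(epoch_values) / math.sqrt(len(epoch_values))
        else:
            half_width = 0.0  # a single run carries no spread information
        band.append((center - half_width, center, center + half_width))
    return band


# Hypothetical validation accuracy of three runs over two epochs.
accuracy_runs = [
    [0.5, 0.8],
    [0.7, 0.8],
    [0.6, 0.8],
]
for epoch, (low, mid, high) in enumerate(mean_with_band(accuracy_runs), start=1):
    print(f"epoch {epoch}: {mid:.3f} [{low:.3f}, {high:.3f}]")
```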

      3.2 Tumor detection – MRI-D dataset

      Table 6 reports the accuracy and F1 score of the seven networks trained under four different conditions (see supplementary material): training from scratch (N), transfer learning (T), data augmentation (D), and both transfer learning and data augmentation (T&D). With T&D, the networks reached maximum F1 scores between 89.23 % and 93.75 %. The F1 score improved by 3.4 % for training from scratch and by 1 % for data augmentation, while transfer learning led to a 6 % increase in accuracy compared to training from scratch. The networks with the highest peak performance were DenseNet121, InceptionV3, and VGG19, while Xception had the lowest scores.
      Table 6. Maximum scores (%) achieved by the seven DL neural networks in detection under the four training conditions.
      Method | Metric | InceptionResNetV2 | InceptionV3 | Xception | ResNet50V2 | DenseNet121 | VGG19 | EfficientNetB7
      T&D | F1 score | 90.32 | 90.32 | 89.23 | 92.06 | 93.33 | 93.75 | 91.80
      D | F1 score | 89.55 | 87.50 | 89.55 | 84.51 | 85.29 | 75.61 | 76.54
      T | F1 score | 90.32 | 93.55 | 88.57 | 87.50 | 89.23 | 91.80 | 85.25
      N | F1 score | 86.96 | 84.06 | 87.32 | 87.32 | 84.06 | 75.61 | 76.54
      T&D | Accuracy | 88.24 | 88.24 | 86.27 | 90.20 | 92.16 | 92.16 | 90.20
      D | Accuracy | 86.27 | 84.31 | 86.27 | 78.43 | 80.39 | 60.78 | 62.75
      T | Accuracy | 88.24 | 92.16 | 84.31 | 84.31 | 86.27 | 90.20 | 82.35
      N | Accuracy | 82.35 | 80.39 | 82.35 | 82.35 | 78.43 | 60.78 | 62.75
      T: Transfer learning, D: Data augmentation, and N: None.
      Fig. 4. Score distributions generated by the different training evaluated with the test data. A) Accuracy, B) Sensitivity, C) Specificity, and D) F1 score. The distributions are shown for the four training conditions, i.e., without any strategy, transfer learning, data augmentation, and combining transfer learning and data augmentation. MRI-D dataset.
      In Table 7, the F1 score and accuracy are presented for the two classes, tumor and not-tumor. The results demonstrate that the combination of data augmentation and transfer learning significantly enhances network performance, with transfer learning being more effective than data augmentation in terms of the F1 score. Specifically, the fourth training condition (T&D) resulted in the highest scores for the tumor class and reduced the F1 gap between the two classes to about 3 %, compared with about 11 % when training without any strategy. The maximum values for both classes behaved similarly, except for accuracy, which increased by up to 9 % between the N and T&D training conditions.
      Table 7. Maximum scores (%) achieved by the seven DL neural networks as a function of the two classes and the four training conditions.
      Method | Class | F1 score | Accuracy
      Transfer learning & Data augmentation | Tumor | 93.75 | 92.16
      Transfer learning & Data augmentation | Not tumor | 90.48 | 92.16
      Data augmentation | Tumor | 89.55 | 86.27
      Data augmentation | Not tumor | 81.08 | 86.27
      Transfer learning | Tumor | 93.55 | 92.16
      Transfer learning | Not tumor | 90.00 | 92.16
      None | Tumor | 87.32 | 82.35
      None | Not tumor | 76.19 | 82.35
      Table 8 presents the p-values calculated between the convolutional networks. InceptionResNetV2 was found to be significantly different from all the other networks, with InceptionV3 having the closest distribution (p = 0.02). In contrast, ResNet50V2 has the score distribution that is similar to the largest number of other networks.
      Table 8. P-value evaluated among the seven neural networks using the Kruskal-Wallis test in tumor detection. The statistic was calculated with the scores of all training conditions.
      Network | No. | 1 | 2 | 3 | 4 | 5 | 6 | 7
      InceptionResNetV2 | 1 | 1.00 | 0.02 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
      InceptionV3 | 2 | 0.02 | 1.00 | 0.51 | 0.29 | 0.01 | 0.00 | 0.00
      Xception | 3 | 0.00 | 0.51 | 1.00 | 0.61 | 0.02 | 0.01 | 0.00
      ResNet50V2 | 4 | 0.00 | 0.29 | 0.61 | 1.00 | 0.10 | 0.01 | 0.00
      DenseNet121 | 5 | 0.00 | 0.01 | 0.02 | 0.10 | 1.00 | 0.17 | 0.01
      VGG19 | 6 | 0.00 | 0.00 | 0.01 | 0.01 | 0.17 | 1.00 | 0.17
      EfficientNetB7 | 7 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.17 | 1.00
      In summary, Table 9 presents the p-values computed between the different training conditions. For the InceptionResNetV2 network, the scores suggest that all training conditions are statistically different, except for data augmentation versus transfer learning (p = 0.68), which are likely to produce the same results. Additionally, the combination of both differs significantly from transfer learning alone for all networks except InceptionV3.
      Table 9. P-value evaluated between the four training conditions for each of the seven convolutional neural networks using the Kruskal-Wallis test. Tumor detection.
      Network | T&D vs T | T&D vs D | T&D vs N | D vs T | D vs N | T vs N
      InceptionResNetV2 | 0.00 | 0.00 | 0.00 | 0.68 | 0.00 | 0.01
      InceptionV3 | 0.21 | 0.00 | 0.00 | 0.00 | 0.02 | 0.00
      Xception | 0.00 | 0.19 | 0.00 | 0.03 | 0.01 | 0.93
      ResNet50V2 | 0.00 | 0.00 | 0.00 | 0.17 | 0.08 | 0.97
      DenseNet121 | 0.00 | 0.00 | 0.00 | 0.89 | 0.04 | 0.09
      VGG19 | 0.03 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00
      EfficientNetB7 | 0.00 | 0.00 | 0.00 | 0.00 | 0.89 | 0.00
      T: Transfer learning, D: Data augmentation, and N: None.
      Fig. 5 illustrates the distribution of the metrics for the two classes. The tumor class typically showed higher sensitivity than specificity; that is, true positives were generally easier to detect than true negatives, which could imply a higher false-positive rate.
      Fig. 5. Metrics as a function of the two classes in tumor detection with MRI-D data.
      In Fig. 6, the training curves for the InceptionResNetV2 network (trained with T&D) are shown as a function of the number of epochs. The average loss curve (Fig. 6A) shows that the network reached a loss below 0.1 around epoch 30, and that the 95 % error band narrows considerably over the epochs. The validation curve followed a similar trajectory but showed signs of potential overfitting. In Fig. 6B, the training and validation accuracy curves had a similar shape but diverged toward the end, indicating overfitting of the model.
      Fig. 6. Average training and 95 % error bands for the best performing network (InceptionResNetV2). A) Accuracy as a function of epochs and B) Loss as a function of epochs with the training and validation data. Tumor detection experiment with transfer learning and data augmentation with the MRI-D database.

      3.3 Tumor detection – TCGA-LGG dataset

      Table 10 summarizes the maximum scores for each of the eight neural networks, including the Cross-Transformer, and the three image acquisition sequences. The InceptionResNetV2 network with the FLAIR sequence had the highest score. The FLAIR sequence was found to be the most successful in detecting tumors for most of the neural networks and this trend held for both accuracy and F1 score metrics. At the other extreme is the VGG19 network with the acquisition sequence T1-Gd.
      Table 10. Maximum scores (%) achieved by the eight DL neural networks in detection, with the three different image acquisition sequences.
      Sequence | Metric | InceptionResNetV2 | Cross-Transformer | Xception | DenseNet121 | InceptionV3 | ResNet50V2 | EfficientNetB7 | VGG19
      T1WI | F1 score | 89.72 | 82.89 | 89.30 | 87.94 | 86.14 | 86.35 | 79.63 | 79.69
      FLAIR | F1 score | 93.45 | 84.76 | 91.95 | 91.99 | 88.85 | 87.29 | 80.13 | 79.44
      T1-Gd | F1 score | 89.42 | 82.84 | 88.45 | 86.83 | 84.80 | 85.85 | 79.26 | 78.83
      T1WI | Accuracy | 86.53 | 88.06 | 85.90 | 84.88 | 82.21 | 81.83 | 71.03 | 70.65
      FLAIR | Accuracy | 91.36 | 89.58 | 89.33 | 89.20 | 84.63 | 83.61 | 70.01 | 68.36
      T1-Gd | Accuracy | 85.90 | 88.31 | 85.13 | 83.35 | 80.81 | 81.07 | 68.74 | 65.06
      Similarly, Table 11 presents the maximum scores based on the two classes and the three image acquisition sequences. In particular, the results showed that the tumor class exhibited lower scores than the non-tumor class. The experiment results supported the superiority of the FLAIR sequence followed by T1WI and T1-Gd as the preferred imaging technique.
      Table 11. Maximum scores (%) achieved by the eight DL neural networks as a function of the two classes and the three image acquisition sequences.
      Sequence | Class | F1 score | Accuracy
      T1WI | Tumor | 81.46 | 86.53
      T1WI | Not tumor | 89.72 | 88.06
      FLAIR | Tumor | 87.31 | 91.36
      FLAIR | Not tumor | 93.45 | 91.36
      T1-Gd | Tumor | 79.93 | 85.90
      T1-Gd | Not tumor | 89.42 | 88.31
      As previously stated, the InceptionResNetV2 model demonstrated the highest efficacy in identifying brain tumors. Furthermore, the network showed statistically significant differences compared to most of the other networks. The Kruskal-Wallis test suggests that only Xception has a score distribution similar to that of InceptionResNetV2, with a p-value of 0.07 (see Table 12).
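      Table 12. P-values evaluated among the eight neural networks using the Kruskal-Wallis test in tumor detection. The statistic was calculated with all the scores of the three image acquisition sequences. Column numbers refer to the network numbers in the first column.

      | No. | Network | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
      |---|---|---|---|---|---|---|---|---|---|
      | 1 | InceptionResNetV2 | 1.00 | 0.00 | 0.07 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
      | 2 | Cross-Transformer | 0.00 | 1.00 | 0.00 | 0.00 | 0.52 | 0.79 | 0.00 | 0.00 |
      | 3 | Xception | 0.07 | 0.00 | 1.00 | 0.06 | 0.00 | 0.00 | 0.00 | 0.00 |
      | 4 | DenseNet121 | 0.00 | 0.00 | 0.06 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 |
      | 5 | InceptionV3 | 0.00 | 0.52 | 0.00 | 0.00 | 1.00 | 0.09 | 0.00 | 0.00 |
      | 6 | ResNet50V2 | 0.00 | 0.79 | 0.00 | 0.00 | 0.09 | 1.00 | 0.00 | 0.00 |
      | 7 | EfficientNetB7 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.01 |
      | 8 | VGG19 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 1.00 |
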
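
      The pairwise comparisons reported in Table 12 can be illustrated in outline as follows. This is a minimal sketch of the Kruskal-Wallis test written with the Python standard library only; the function name `kruskal_wallis`, the restriction to two or three groups, and the accuracy values are illustrative assumptions, not the authors' code or data.

```python
import math
from collections import Counter


def kruskal_wallis(*groups):
    """Return (H, p) for 2 or 3 independent samples.

    Uses the chi-square approximation with k - 1 degrees of freedom.
    Closed-form survival functions exist for 1 and 2 degrees of freedom,
    which is why this sketch is limited to 2 or 3 groups.
    """
    k = len(groups)
    if k not in (2, 3):
        raise ValueError("this sketch only supports 2 or 3 groups")

    pooled = [x for g in groups for x in g]
    n = len(pooled)

    # Assign average ranks so that tied values share the same rank.
    counts = Counter(pooled)
    rank_of = {}
    start = 1
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = start + (t - 1) / 2
        start += t

    # H statistic computed from the rank sum of each group.
    h = 12.0 / (n * (n + 1)) * sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)

    # Correction for ties; zero means every value is identical.
    correction = 1 - sum(t**3 - t for t in counts.values()) / (n**3 - n)
    if correction == 0:
        return 0.0, 1.0
    h = max(h / correction, 0.0)  # guard against tiny negative round-off

    # Chi-square survival function for 1 or 2 degrees of freedom.
    p = math.erfc(math.sqrt(h / 2)) if k == 2 else math.exp(-h / 2)
    return h, p


# Hypothetical accuracy scores (%) from repeated runs of two networks.
network_a = [91.4, 90.8, 91.0, 90.5, 91.2]
network_b = [84.1, 83.6, 84.6, 83.9, 84.3]
h_stat, p_value = kruskal_wallis(network_a, network_b)
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```

      A p-value below 0.05 would be read, as in the table, as a statistically significant difference between the two score distributions. For more than three groups, a statistics library would normally be used instead of this closed-form shortcut.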
      Table 13 displays the Kruskal-Wallis p-values obtained for each network when comparing pairs of image acquisition sequences. For most networks, FLAIR differed significantly from both T1WI and T1-Gd; the exceptions were EfficientNetB7 and VGG19. In contrast, the comparison between T1WI and T1-Gd showed no significant difference for most networks: only InceptionResNetV2 and DenseNet121 yielded p-values below the significance level (0.04 in both cases).
      Table 13. P-values evaluated between the three image acquisition sequences for each of the eight ANNs using the Kruskal-Wallis test. Tumor detection.

      | Network | FLAIR vs. T1WI | FLAIR vs. T1-Gd | T1WI vs. T1-Gd |
      |---|---|---|---|
      | InceptionResNetV2 | 0.00 | 0.00 | 0.04 |
      | Cross-Transformer | 0.00 | 0.00 | 0.67 |
      | Xception | 0.00 | 0.00 | 0.25 |
      | DenseNet121 | 0.00 | 0.00 | 0.04 |
      | InceptionV3 | 0.00 | 0.00 | 0.13 |
      | ResNet50V2 | 0.03 | 0.00 | 0.35 |
      | EfficientNetB7 | 0.94 | 0.22 | 0.24 |
      | VGG19 | 0.17 | 0.60 | 0.07 |

      Fig. 7 demonstrates the distribution of scores by networks and image acquisition sequences. The box-and-whisker plots in Fig. 7A reveal that the FLAIR sequence had a higher distribution than the other two sequences in each network, signifying that it has higher accuracy in detecting brain tumors. Moreover, sensitivity, specificity, and F1 score metrics displayed a similar pattern to the accuracy.
      Fig. 7. Score distributions generated by the different training runs, evaluated with the test data. A) Accuracy, B) Sensitivity, C) Specificity, and D) F1 score. The distributions are shown for the three image acquisition sequences, i.e., T1WI, FLAIR, and T1-Gd. TCGA-LGG dataset.
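      The four metrics plotted in Fig. 7 all derive from the confusion-matrix counts (TP, FP, TN, and FN, as listed in the Glossary). The sketch below shows the standard definitions; the function name `detection_metrics` and the example counts are illustrative assumptions and do not come from the study.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, specificity, and F1 from counts."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for the tumor class on a test set of 200 images.
metrics = detection_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f} %")
```

      Because F1 ignores true negatives while accuracy does not, the two metrics can rank networks differently when the classes are imbalanced.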
      In Fig. 8, for InceptionResNetV2, the accuracy of both the training and validation curves increased with the number of epochs, converging above 0.9. Likewise, the loss curves (see Fig. 8B) decreased for training and validation, converging to approximately 0.5. In this case, the validation curves were better than the training curves; therefore, a larger number of epochs would be required to improve the model’s performance further.
      Fig. 8. Average training curves and 95 % error bands for the best-performing network, InceptionResNetV2. A) Accuracy as a function of epochs and B) Loss as a function of epochs, with training and validation data. Tumor detection experiment with the FLAIR sequence on the TCGA-LGG dataset.
      Similarly, Fig. 9 depicts the training and validation curves of accuracy and loss for the Cross-Transformer. The proposed model performs well, with accuracy rising above 0.9 as the number of epochs increases. However, the validation score fell below the training score after epoch 40, indicating model overtraining.
      Fig. 9. Average training curves and 95 % error bands for the proposed network (Cross-Transformer). A) Accuracy as a function of epochs and B) Loss as a function of epochs, with training and validation data. Tumor detection experiment with the FLAIR sequence on the TCGA-LGG dataset.
      Although the accuracy in Fig. 9 reached higher values during training, the proposed model's score distributions were partially lower than those of the InceptionResNetV2 network. Additionally, the model loss reached low values; however, the validation loss rose above the training loss after epoch 25, again showing partial overtraining.
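      One common way to act on this kind of divergence between training and validation curves is early stopping, which keeps the epoch with the best validation loss. The study does not state that early stopping was used, so the following is only an illustrative sketch; the function name, the `patience` parameter, and the loss values are assumptions.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return the 1-based epoch with the best validation loss.

    Training is considered stopped once the validation loss has failed
    to improve by more than `min_delta` for `patience` epochs in a row.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # validation loss has stalled or started rising
    return best_epoch


# Hypothetical validation losses: improving at first, then rising.
losses = [0.90, 0.70, 0.60, 0.55, 0.56, 0.58, 0.60, 0.65]
print(early_stopping_epoch(losses, patience=3))
```

      With a monotonically decreasing validation loss, as described for InceptionResNetV2, the function simply returns the last epoch, which matches the observation that more epochs could still help.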
      Finally, Fig. 10 shows the average training time of the eight architectures implemented in this study. The results indicate that the proposed model (Cross-Transformer) is highly efficient: on average, it took approximately 18 min to train for 50 epochs with almost three thousand images. In comparison, training InceptionResNetV2 took 5x longer and EfficientNetB7 took 9x longer under the same conditions.
      Fig. 10. Average training time of the eight architectures, with 95 % confidence intervals (black lines).
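      A 95 % confidence interval for a mean training time, like the black lines in Fig. 10, can be estimated as sketched below. The paper does not describe how its intervals were computed, so this example assumes a normal approximation; the function name and the run times are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(samples, confidence=0.95):
    """Return (mean, lower, upper) using a normal approximation."""
    center = mean(samples)
    # Two-sided critical value, about 1.96 for a 95 % interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(samples) / len(samples) ** 0.5
    return center, center - half_width, center + half_width


# Hypothetical training times (minutes) over repeated runs of one network.
times = [17.2, 18.5, 18.1, 17.8, 18.4]
center, lower, upper = mean_confidence_interval(times)
print(f"{center:.2f} min (95 % CI {lower:.2f} to {upper:.2f})")
```

      With only a handful of runs, a Student t quantile would give a somewhat wider and more conservative interval than the normal quantile used here.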
      The training times were compared with the Kruskal-Wallis test. The p-values are shown in Table 14, which demonstrates that the proposed model (Cross-Transformer) showed statistically significant differences from the other seven models. The closest model was the ResNet50V2 network, with a p-value of 0.04; the highest p-value in the table, 0.09, corresponds to the comparison between ResNet50V2 and InceptionV3.
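      Table 14. P-values evaluated among the eight neural networks using the Kruskal-Wallis test on training time for tumor detection. The statistic was estimated with all the scores from the three image acquisition sequences. Column numbers refer to the network numbers in the first column.

      | No. | Network | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
      |---|---|---|---|---|---|---|---|---|---|
      | 1 | Cross-Transformer | 1.00 | 0.04 | 0.02 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
      | 2 | ResNet50V2 | 0.04 | 1.00 | 0.09 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
      | 3 | InceptionV3 | 0.02 | 0.09 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
      | 4 | VGG19 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.01 |
      | 5 | DenseNet121 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.06 | 0.00 | 0.00 |
      | 6 | Xception | 0.00 | 0.00 | 0.00 | 0.00 | 0.06 | 1.00 | 0.07 | 0.00 |
      | 7 | InceptionResNetV2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.07 | 1.00 | 0.00 |
      | 8 | EfficientNetB7 | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 0.00 | 0.00 | 1.00 |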

      4. Discussion

      The primary focus of this research is to investigate the aggressiveness of cancer, particularly in the context of brain tumors, and their potential to cause severe complications, regardless of whether they are malignant or benign. To achieve this goal, various datasets and deep learning neural networks were employed to detect and classify different types of brain tumors, such as meningioma, glioma, and pituitary tumors. The study utilized several classification networks including ResNet50V2, EfficientNetB7, InceptionResNetV2, InceptionV3, VGG19, Xception, and DenseNet121. Additionally, the study evaluated the acquisition sequences of magnetic resonance imaging (MRI) scans, including T1WI, FLAIR, and T1-Gd, in order to improve the accuracy of the classification process. The ultimate objective of this research is to detect brain tumors quickly and effectively to prevent potentially serious complications.
      In the classification experiment, the InceptionResNetV2 network predicted pituitary tumors with high accuracy, with scores above 97 %. This result was surprising for two main reasons: firstly, the dataset contained fewer pituitary tumors than gliomas, and secondly, pituitary tumors are typically smaller and harder to detect visually (see Fig. 1). However, the pituitary tumors exhibited homogeneous behavior, which led to high classification scores. The dataset comprised 1426 gliomas, 930 pituitary tumors, and 708 meningiomas. Although gliomas were the largest class, the models classified pituitary tumors better than gliomas. The results suggest that gliomas may be harder to detect due to their physiological characteristics in MRI images, and that networks can identify some pathologies more easily even with fewer examples. Further studies are needed to understand why pituitary tumors are easier for DL networks to detect.
      In the case of meningiomas, the performance metrics were below those of the other two tumor types (low sensitivity and F1 score), which is in line with the research conducted by Swati et al., who reported an F1 score of 88.88 % for meningioma in contrast to 94.52 % and 91.80 % for glioma and pituitary, respectively [Swati Z.N.K. et al., Brain tumor classification for MR images using transfer learning and fine-tuning]. Those scores showed the high effectiveness of DL models and were the highest in the state of the art at the time; our research achieved F1 scores of 82.71, 93.59, and 95.39 in the classification of meningioma, glioma, and pituitary tumors, respectively (see Table 4). Furthermore, this confirms the need to explore new DL networks, because Swati et al. focused only on the AlexNet and VGG19 networks.
      Secondly, we investigated the diagnosis of brain tumors, i.e., whether the MRI images revealed any physiological features characteristic of brain tumors. The experiment was performed on the MRI-D database. The results confirmed that combining data augmentation with transfer learning significantly improved the models’ performance, increasing accuracy by up to 6 % (InceptionResNetV2 network). Training with data augmentation alone also improved the networks’ performance, but to a lesser extent than transfer learning. However, this does not imply that data augmentation alone is sufficient to improve network performance since, as Sugimori et al. showed, data augmentation is effective on smaller data sets [Sugimori H., Hamaguchi H., Fujiwara T., Ishizaka K., Classification of type of brain magnetic resonance images with deep learning technique].
      Moreover, the scores indicate that the difference between the metrics of the two classes decreased significantly. For example, the difference in F1 score dropped from approximately 11 % to 3 %. This reduction and the performance improvement are partially attributable to transfer learning, since using this strategy alone produced results similar to those obtained when data augmentation was added, which yielded only a slight further improvement of 0.2 %. Although data augmentation did not significantly increase the model’s performance, this could be because only geometric flipping of the images was used; future work could therefore explore other data augmentation strategies and their impact on the classification and detection of brain tumors.
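      The geometric flipping mentioned above can be sketched as follows. The text does not specify which flips were applied, so this example assumes both horizontal and vertical flips and represents an image as a plain list of pixel rows; the function names are illustrative.

```python
def flip_horizontal(image):
    """Mirror an image left to right (each row is reversed)."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror an image top to bottom (row order is reversed)."""
    return [row[:] for row in image[::-1]]


def augment_with_flips(images, labels):
    """Return the originals plus their flipped copies, with matching labels."""
    out_images, out_labels = [], []
    for image, label in zip(images, labels):
        for variant in (image, flip_horizontal(image), flip_vertical(image)):
            out_images.append(variant)
            out_labels.append(label)  # flipping does not change the class
    return out_images, out_labels


# Tiny 2x2 "image" with a hypothetical label (1 = tumor).
image = [[1, 2],
         [3, 4]]
augmented, augmented_labels = augment_with_flips([image], [1])
print(len(augmented), augmented_labels)
```

      Flips only change the orientation of existing anatomy, which may help explain why this transformation alone added little beyond transfer learning.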
      Thirdly, the brain tumor detection experiment was replicated on the TCGA-LGG database. However, with this database, a comparison scan was performed between the three image acquisition sequences: T1WI, FLAIR, and T1-Gd. A statistical comparison between the values generated by the networks showed that the FLAIR acquisition sequence is statistically different from the other two acquisition sequences. Additionally, since the FLAIR sequence presented the distribution of scores with the highest values, it is possible to conclude that this sequence is ideal for detecting tumors in low-grade gliomas. In this sense, the result opens possibilities for future work, for example, a new multimodal network could be built with the different image acquisition sequences, but giving more relevance to the FLAIR sequence, which could allow the network to obtain additional information from the other sequences, increasing its performance.
      On the other hand, some networks had similar distributions of scores. The similarity was statistically validated, showing that some networks are not statistically different. In other words, networks compared with a p-value greater than 0.05 would be expected to yield equivalent results; this includes the Cross-Transformer with the InceptionV3 and ResNet50V2 networks. Even though these networks are not significantly different in terms of performance metrics, the Cross-Transformer is more efficient in training time, achieving the same results in up to half the time of the other two networks (see Fig. 10). The time difference is statistically significant, with a p-value of 0.04 with respect to ResNet50V2.
      In this paper, we have evaluated the performance of state-of-the-art neural networks in the tasks of tumor type classification and tumor detection under different configurations. The results indicate a promising direction towards more accurate automatic systems that could potentially have clinical validity if evaluated on data from multiple centers and compared with labels from radiologists or expert practitioners. Moreover, the performance metrics are encouraging, and the experiments showed some observations to be considered for the implementation of the different artificial intelligence methods. However, the study has some limitations. Firstly, access to large amounts of data remains a concern. Even though the experiments were conducted with several databases, there was a great deal of variation among the sets. For example, not all images have the same spatial resolution, i.e., they are of different sizes. As a result, the image acquisition sequences vary greatly, although, in the case of brain tumors, the FLAIR sequence appears most suitable for tumor detection, based on the results obtained. Additionally, the largest and most common databases were utilized in this study. TCGA-LGG, one of the most widely used databases of brain tumors on MR imaging, was used as an example. However, the database is limited. Although the set consists of 3929 MR images, the set itself is misleading since all images were generated using only 110 subjects. This is to be expected in most MRI implementations, as it represents a computationally excessive load. It is even one of the main drawbacks limiting the implementation of 3D models. The databases used were based on 2D images that do not specify the selection criteria for the slices of the 3D volume.

      5. Conclusion

      An analysis of brain tumor classification and detection on magnetic resonance imaging has been performed using different datasets. Initially, we evaluated seven recent neural networks for the classification of meningioma, glioma, and pituitary tumors (BTD database). The results indicate that the neural networks perform very well in both detection and classification. Firstly, the InceptionResNetV2 network achieved up to 97 % accuracy, outperforming the InceptionV3, DenseNet121, Xception, ResNet50V2, VGG19, and EfficientNetB7 networks with a significance level of 0.07 in the Kruskal-Wallis test. Additionally, InceptionResNetV2 provided the most homogeneous distribution, supporting its high effectiveness.
      Moreover, it was noted that pituitary tumors are well distinguished from meningiomas and gliomas, even though the former are represented by fewer images than gliomas. After this, the MRI-D dataset was used to detect brain tumors by incorporating transfer learning and data augmentation. Together, these two strategies increased the accuracy of InceptionResNetV2 by up to 6 % over the model trained from scratch. Further, this combination was statistically different from networks trained under other conditions, such as training with only transfer learning or only data augmentation. In fact, out of the 7 networks cited above, only InceptionV3 and Xception were statistically significant at the 0.05 level. Finally, the detection was replicated on TCGA-LGG data by examining the T1WI, FLAIR, and T1-Gd acquisition sequences. A new network, referred to as the Cross-Transformer, was introduced in this experiment. Results showed that the FLAIR sequence is more suitable for brain tumor detection, with p-values of at most 0.03 in six of the eight networks, the exceptions being the EfficientNetB7 and VGG19 networks. Additionally, it was shown that the Cross-Transformer achieved accuracy values close to 90 % while requiring only a fraction of the training time of the second-fastest network, ResNet50V2.

      Funding statement

      No funding was received for this work.

      CRediT authorship contribution statement

      Conceptualization, Anaya-Isaza and Mera-Jiménez; Methodology, Anaya-Isaza and Mera-Jiménez; Software, Anaya-Isaza and Mera-Jiménez; Validation, Mera-Jiménez; Formal analysis, Anaya-Isaza; Investigation, Anaya-Isaza; Resources, Anaya-Isaza and Mera-Jiménez; Writing – original draft, Anaya-Isaza, Mera-Jiménez, Verdugo-Alejo and Sarasti-Ramírez; Writing – review & editing, Anaya-Isaza, Verdugo-Alejo, Mera-Jiménez and Sarasti-Ramírez; Visualization, Anaya-Isaza and Mera-Jiménez; Supervision, Anaya-Isaza and Mera-Jiménez; Project administration, Anaya-Isaza and Mera-Jiménez; Funding acquisition, Anaya-Isaza. All authors have read and agreed to the published version of the manuscript.

      Declaration of Competing Interest

      The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

      Acknowledgement

      This research was supported by the research division of INDIGO Technologies (https://indigo.tech/). The results published here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga.

      Appendix

      Hyperparameters

      Adadelta: A stochastic gradient descent method based on a per-dimension adaptive learning rate, used to optimize the training parameters, i.e., to adjust the model parameters or weights during training [M.D. Zeiler, ADADELTA: an Adaptive Learning Rate Method, Dec. 2012].
      Batch size: The number of samples processed before updating the model weights [M. Li, T. Zhang, Y. Chen, and A. J. Smola, Efficient mini-batch training for stochastic optimization, in: Proceedings of the Twentieth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug. 2014, 661–670. doi: 10.1145/2623330.2623612]. The larger the batch size, the faster the training, but it requires more RAM.
      Bias initialization: Values taken by the bias (bj) of the model before the model training is started (see Eq. (1)).
      Categorical cross-entropy: Loss function based on the logarithmic difference (see Eq. (6)) between two probability distributions of random data or sets of events. It is used to classify the elements of a set [Yi-de Ma, Qing Liu, and Zhi-bai Quan, Automated image segmentation using improved PCNN model based on cross-entropy, in: Proceedings of the International Symposium on Intelligent Multimedia, Video and Speech Processing, 2004, 743–746. doi: 10.1109/ISIMP.2004.1434171]. In the case of images, this principle can be applied to image pixels, where each element is assigned to one of two possible categories: background and object of interest.
      $$L_{BCE}(y,\hat{y}) = -\left[\,y\log\hat{y} + (1-y)\log(1-\hat{y})\,\right] \tag{6}$$


      Here y are the actual labels and ŷ are the values predicted by the model.
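      As a numerical illustration of Eq. (6), the sketch below evaluates the binary cross-entropy for a small batch. Averaging over the samples and clipping predictions with `eps` to avoid log(0) are common conventions assumed here, and the leading minus sign follows the standard definition of the loss; none of these details are stated in the text.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # keep log() arguments strictly positive
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical labels (1 = object of interest, 0 = background) and predictions.
labels = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]
uncertain = [0.6, 0.4, 0.5, 0.5]
print(binary_cross_entropy(labels, confident))
print(binary_cross_entropy(labels, uncertain))
```

      The loss is smaller for the confident, correct predictions than for the uncertain ones, which is the behavior the optimizer exploits during training.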
      Cross-validation: A technique used to evaluate the performance of artificial intelligence networks while guaranteeing independence between the training and validation partitions. The method consists of dividing the data set into a given number of subsets. One subset is held out for validation and the model is trained with the remaining subsets. The process is repeated, and in each run a different subset is taken for validation [Belyadi H., Haghighat A., Model evaluation].
      Epochs: Number of times the model training is repeated with the whole data set.
      Loss function: A function that determines the difference between the actual data and the data predicted by the network or model [Wang Q., Ma Y., Zhao K., Tian Y., A Comprehensive Survey of Loss Functions in Machine Learning].
      Optimizer: The way in which the gradient (or a gradient variant) of the training parameters is calculated to adjust these values towards values that optimize or reduce the loss function [S. Ruder, An Overview of Gradient Descent Optimization Algorithms, Sep. 2016].
      Performance metrics: Functions to monitor and measure model performance based on actual and predicted model values [Kotu V., Deshpande B., Model evaluation].
      Training parameters: Coefficients that accompany the mathematical models’ operations and that are iteratively adjusted in the training process (e.g., weights and bias).
      Initialization of weights: Values taken by the training model parameters or weights before the model training is started. In the case of a convolutional network, the weights are those that make up the convolutional filter Kij (see Eq. (1)).
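
      The cross-validation procedure defined in this appendix can be sketched as a k-fold split over sample indices. The number of folds, the shuffling seed, and the function name `k_fold_indices` are illustrative assumptions, since the appendix does not state them.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    # Deal the shuffled indices into k nearly equal, disjoint subsets.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Hypothetical example: 10 samples split into 5 folds.
for run, (train, validation) in enumerate(k_fold_indices(10, 5), start=1):
    print(f"run {run}: train={sorted(train)} validation={sorted(validation)}")
```

      Each sample appears in exactly one validation subset, so every run is evaluated on data that was not used for training in that run. When several images come from the same subject, splitting by subject rather than by image would be needed to keep the partitions truly independent.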

      Glossary

      AI: Artificial Intelligence.
      ANNs: Artificial Neural Networks.
      BTD: Brain Tumor Dataset.
      CNNs: Convolutional Neural Networks.
      CNS: Central Nervous System.
      CT: Computed Tomography.
      DBN: Deep Belief Network.
      DL: Deep Learning.
      ESMF: Extended Set-Membership Filter.
      FDR: False Detection Rate.
      FLAIR: Fluid-Attenuated Inversion Recovery.
      FN: False Negatives.
      FNR: False Negative Rate.
      FP: False Positives.
      FPR: False Positive Rate.
      GLCM: Gray-Level Co-occurrence Matrix.
      GLRM: Gray-Level Run-length Matrix.
      kNN: k Nearest Neighbor.
      MCC: Matthews Correlation Coefficient.
      MPSO: Modified Particle Swarm Optimization.
      MRI: Magnetic Resonance Imaging.
      MRI-D: Magnetic Resonance Imaging Dataset.
      PCA: Principal Component Analysis.
      RGB: Red, Green, and Blue colors.
      SGHO: Swarm-based Grasshopper Optimization.
      TCIA: The Cancer Imaging Archive.
      TCGA-LGG: The Cancer Genome Atlas Low Grade Glioma.
      TN: True Negatives.
      TP: True Positives.
      T1WI: T1-weighted images.
      T1-Gd: T1-weighted post-gadolinium (post-contrast) images.
      WGAN: Wasserstein Generative Adversarial Network.
      WHO: World Health Organization.

      Appendix C. Supplementary material

      References

      1. American Cancer Society, Cancer Facts & Figures 2020, 1–76, 2020.
      2. Cancer, https://www.who.int/en/news-room/fact-sheets/detail/cancer (accessed 17 November 2020).
        • Mack T.M.
        What a cancer is.
        Cancers in the Urban Environment. Elsevier, 2021: 5-8https://doi.org/10.1016/B978-0-12-811745-3.00003-3
        • Ray S.D.
        • Yang N.
        • Pandey S.
        • Bello N.T.
        • Gray J.P.
        Apoptosis.
        Reference Module in Biomedical Sciences. Elsevier, 2019. https://doi.org/10.1016/B978-0-12-801238-3.62145-1
        • Foster J.R.
        Introduction to Neoplasia.
        Comprehensive Toxicology. Elsevier, 2018: 1-10https://doi.org/10.1016/B978-0-12-801238-3.02217-0
        • Yokota J.
        Tumor progression and metastasis.
        Carcinogenesis. 2000; vol. 21: 497-503https://doi.org/10.1093/carcin/21.3.497
        • Ost D.E.
        • Gould M.K.
        Decision making in patients with pulmonary nodules.
        Am. J. Respir. Crit. Care Med. 2012; 185: 363-372https://doi.org/10.1164/rccm.201104-0679CI
        • Auvinen A.
        • Hakama M.
        Cancer screening: theory and applications.
        International Encyclopedia of Public Health. Elsevier, 2017: 389-405https://doi.org/10.1016/B978-0-12-803678-5.00050-3
        • Huang R.
        • Boltze J.
        • Li S.
        Strategies for improved intra-arterial treatments targeting brain tumors: a systematic review.
        Front. Oncol. 2020; 10https://doi.org/10.3389/fonc.2020.01443
        • Moon S.-J.
        • Ginat D.T.
        • Tubbs R.S.
        • Moisi M.D.
        Tumors of the brain.
        Central Nervous System Cancer Rehabilitation. Elsevier, 2019: 27-34. https://doi.org/10.1016/B978-0-323-54829-8.00004-4
        • Sontheimer H.
        Brain tumors.
        Diseases of the Nervous System. Elsevier, 2021: 207-233. https://doi.org/10.1016/B978-0-12-821228-8.00009-3
        • Reynoso-Noverón N.
        • Mohar-Betancourt A.
        • Ortiz-Rafael J.
        Epidemiology of brain tumor.
        Principles of Neuro-Oncology. Springer International Publishing, Cham, 2021: 15-25. https://doi.org/10.1007/978-3-030-54879-7_2
        • Turner N.
        • Vidovic N.
        Cancer health concerns.
        Reference Module in Food Science. Elsevier, 2018https://doi.org/10.1016/B978-0-08-100596-5.22577-8
        • Troyanskaya O.
        • Trajanoski Z.
        • Carpenter A.
        • Thrun S.
        • Razavian N.
        • Oliver N.
        Artificial intelligence and cancer.
        Nat. Cancer. 2020; 1: 149-152https://doi.org/10.1038/s43018-020-0034-6
        • Bi W.L.
        • et al.
        Artificial intelligence in cancer imaging: Clinical challenges and applications.
        CA Cancer J. Clin. 2019; caac.21552https://doi.org/10.3322/caac.21552
        • Hosny A.
        • Parmar C.
        • Quackenbush J.
        • Schwartz L.H.
        • Aerts H.J.W.L.
        Artificial intelligence in radiology.
        Nat. Rev. Cancer. 2018; 18: 500-510https://doi.org/10.1038/s41568-018-0016-5
        • Sharif M.I.
        • Li J.P.
        • Naz J.
        • Rashid I.
        A comprehensive review on multi-organs tumor detection based on machine learning.
        Pattern Recognit. Lett. 2020; 131: 30-37https://doi.org/10.1016/j.patrec.2019.12.006
        • Bhatele K.R.
        • Bhadauria S.S.
        Brain structural disorders detection and classification approaches: a review.
        Artif. Intell. Rev. 2020; 53: 3349-3401https://doi.org/10.1007/s10462-019-09766-9
        • Pauli R.
        • Wilson M.
        The basic principles of magnetic resonance imaging.
        Encyclopedia of Behavioral Neuroscience. second ed. Elsevier, 2022: 105-113https://doi.org/10.1016/B978-0-12-819641-0.00108-0
        • Duong M.T.
        • Rauschecker A.M.
        • Mohan S.
        Diverse applications of artificial intelligence in neuroradiology.
        Neuroimaging Clin. N. Am. 2020; 30: 505-516https://doi.org/10.1016/j.nic.2020.07.003
        • Nazir M.
        • Shakil S.
        • Khurshid K.
        Role of deep learning in brain tumor detection and classification (2015 to 2020): A review.
        Comput. Med. Imaging Graph. 2021; 91101940https://doi.org/10.1016/j.compmedimag.2021.101940
        • Işın A.
        • Direkoğlu C.
        • Şah M.
        Review of MRI-based brain tumor image segmentation using deep learning methods.
        Procedia Comput. Sci. 2016; 102: 317-324https://doi.org/10.1016/j.procs.2016.09.407
        • Aggarwal R.
        • et al.
        Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis.
        Npj Digit. Med. 2021; 4: 65https://doi.org/10.1038/s41746-021-00438-z
        • Serte S.
        • Serener A.
        • Al-Turjman F.
        Deep learning in medical imaging: a brief review.
        Trans. Emerg. Telecommun. Technol. 2020; https://doi.org/10.1002/ett.4080
        • He K.
        • Zhang X.
        • Ren S.
        • Sun J.
        Delving deep into rectifiers: surpassing human-level performance on imagenet classification.
        Proc. IEEE Int. Conf. Comput. Vis. 2015; 2015: 1026-1034https://doi.org/10.1109/ICCV.2015.123
        • Schmidhuber J.
        Deep learning in neural networks: an overview.
        Neural Netw. 2015; 61: 85-117https://doi.org/10.1016/j.neunet.2014.09.003
        • Ravi D.
        • et al.
        Deep learning for health informatics.
        IEEE J. Biomed. Health Inform. 2017; 21: 4-21https://doi.org/10.1109/JBHI.2016.2636665
        • Liu W.
        • Wang Z.
        • Liu X.
        • Zeng N.
        • Liu Y.
        • Alsaadi F.E.
        A survey of deep neural network architectures and their applications.
        Neurocomputing. 2017; 234: 11-26https://doi.org/10.1016/j.neucom.2016.12.038
        • Dong S.
        • Wang P.
        • Abbas K.
        A survey on deep learning and its applications.
        Comput. Sci. Rev. 2021; 40100379https://doi.org/10.1016/j.cosrev.2021.100379
        • Zhang X.
        • Smith N.
        • Webb A.
        Medical imaging.
        Biomedical Information Technology. Elsevier, 2020: 3-49https://doi.org/10.1016/B978-0-12-816034-3.00001-8
        • Deepa G.
        • Mary G.L.R.
        • Karthikeyan A.
        • Rajalakshmi P.
        • Hemavathi K.
        • Dharanisri M.
        Detection of brain tumor using modified particle swarm optimization (MPSO) segmentation via haralick features extraction and subsequent classification by KNN algorithm.
        Mater. Today Proc. 2021; https://doi.org/10.1016/j.matpr.2021.10.475
        • Islam M.K.
        • Ali M.S.
        • Miah M.S.
        • Rahman M.M.
        • Alam M.S.
        • Hossain M.A.
        Brain tumor detection in MR image using superpixels, principal component analysis and template based K-means clustering algorithm.
        Mach. Learn. Appl. 2021; 5100044https://doi.org/10.1016/j.mlwa.2021.100044
        • Bhagat N.
        • Kaur G.
        MRI brain tumor image classification with support vector machine.
        Mater. Today Proc.. 2021; https://doi.org/10.1016/j.matpr.2021.11.368
        • Chandra Joshi R.
        • Mishra R.
        • Gandhi P.
        • Pathak V.K.
        • Burget R.
        • Dutta M.K.
        Ensemble based machine learning approach for prediction of glioma and multi-grade classification.
        Comput. Biol. Med. 2021; 137104829https://doi.org/10.1016/j.compbiomed.2021.104829
        • Sathies Kumar T.
        • Arun C.
        • Ezhumalai P.
        An approach for brain tumor detection using optimal feature selection and optimized deep belief network.
        Biomed. Signal. Process. Control. 2022; 73103440https://doi.org/10.1016/j.bspc.2021.103440
        • Takao H.
        • Amemiya S.
        • Kato S.
        • Yamashita H.
        • Sakamoto N.
        • Abe O.
        Deep-learning single-shot detector for automatic detection of brain metastases with the combined use of contrast-enhanced and non-enhanced computed tomography images.
        Eur. J. Radiol. 2021; 144110015https://doi.org/10.1016/j.ejrad.2021.110015
        • Xiao Y.
        • Wu J.
        • Lin Z.
        Cancer diagnosis using generative adversarial networks based on deep learning from imbalanced data.
        Comput. Biol. Med. 2021; 135104540https://doi.org/10.1016/j.compbiomed.2021.104540
        • Song G.
        • Shan T.
        • Bao M.
        • Liu Y.
        • Zhao Y.
        • Chen B.
        Automatic brain tumour diagnostic method based on a back propagation neural network and an extended set-membership filter.
        Comput. Methods Prog. Biomed. 2021; 208106188https://doi.org/10.1016/j.cmpb.2021.106188
        • Ait Skourt B.
        • El Hassani A.
        • Majda A.
        Mixed-pooling-dropout for convolutional neural network regularization.
        J. King Saud. Univ. Comput. Inf. Sci. 2021; https://doi.org/10.1016/j.jksuci.2021.05.001
        • Cheng J.
        Brain tumor dataset.
        Figshare. 2017; https://doi.org/10.6084/m9.figshare.1512427.v5
        • Ahuja S.
        • Panigrahi B.K.
        • Gandhi T.K.
        Enhanced performance of Dark-Nets for brain tumor classification and segmentation using colormap-based superpixel techniques.
        Mach. Learn. Appl. 2022; 7100212https://doi.org/10.1016/j.mlwa.2021.100212
        • Tandel G.S.
        • Tiwari A.
        • Kakde O.G.
        Performance optimisation of deep learning models using majority voting algorithm for brain tumour classification.
        Comput. Biol. Med. 2021; 135104564https://doi.org/10.1016/j.compbiomed.2021.104564
        • Waring J.
        • Lindvall C.
        • Umeton R.
        Automated machine learning: review of the state-of-the-art and opportunities for healthcare.
        Artif. Intell. Med. 2020; 104101822https://doi.org/10.1016/j.artmed.2020.101822
        • Šimundić A.-M.
        Measures of diagnostic accuracy: basic definitions.
        EJIFCC. 2009; 19: 203-211
      3. D.M.W. Powers, Evaluation: from Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation, Oct. 2020.

        • Trevethan R.
        Sensitivity, specificity, and predictive values: foundations, pliabilities, and pitfalls in research and practice.
        Front. Public Health. 2017; 5: 307 https://doi.org/10.3389/fpubh.2017.00307
        • Swati Z.N.K.
        • et al.
        Brain tumor classification for MR images using transfer learning and fine-tuning.
        Comput. Med. Imaging Graph. 2019; 75: 34-46 https://doi.org/10.1016/j.compmedimag.2019.05.001
        • Sugimori H.
        • Hamaguchi H.
        • Fujiwara T.
        • Ishizaka K.
        Classification of type of brain magnetic resonance images with deep learning technique.
        Magn. Reson. Imaging. 2021; 77: 180-185 https://doi.org/10.1016/j.mri.2020.12.017
      4. M.D. Zeiler, ADADELTA: an Adaptive Learning Rate Method, Dec. 2012.

      5. M. Li, T. Zhang, Y. Chen, and A. J. Smola, Efficient mini-batch training for stochastic optimization, in: Proceedings of the Twentieth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug. 2014, 661–670. doi: 10.1145/2623330.2623612.

      6. Yi-de Ma, Qing Liu, and Zhi-bai Quan, Automated image segmentation using improved PCNN model based on cross-entropy, in: Proceedings of the International Symposium on Intelligent Multimedia, Video and Speech Processing, 2004, 743–746. doi: 10.1109/ISIMP.2004.1434171.

        • Belyadi H.
        • Haghighat A.
        Model evaluation.
        Mach. Learn. Guide Oil Gas. Using Python. 2021; 349-380 https://doi.org/10.1016/B978-0-12-821929-4.00009-3
        • Wang Q.
        • Ma Y.
        • Zhao K.
        • Tian Y.
        A Comprehensive Survey of Loss Functions in Machine Learning.
        Ann. Data Sci. 2020; https://doi.org/10.1007/s40745-020-00253-5
      7. S. Ruder, An Overview of Gradient Descent Optimization Algorithms, Sept. 2016.

        • Kotu V.
        • Deshpande B.
        Model evaluation.
        Data Sci. 2019; 263-279 https://doi.org/10.1016/B978-0-12-814761-0.00008-3