Improving the Accuracy and Performance of Deep Learning Model by Applying Hybrid Grey Wolf Whale Optimizer to P&C Insurance Data

— The insurance industry is built on risk calculation, profitability, and detailed information, and the predictive models that insurance companies use allow them to make accurate business decisions. This research focuses on improving the accuracy of predicting customers of Property and Casualty (P&C) insurance. In this study, a reliable quantitative analytical big data method has been developed in which the Hybrid Grey Wolf and Whale Optimization (HGWWO) algorithm is applied to a deep learning model to evaluate the behavior of P&C insurance customers. The paper describes the Hybrid Grey Wolf-Whale Optimization algorithm and the steps involved in the optimization process: how a Grey Wolf Optimizer and a Whale Optimizer are constructed and then combined for initialization, evaluation, and optimization over the relevant P&C insurance dataset to improve prediction accuracy. We also compare the performance of the deep learning model with several traditional machine learning models.


I. INTRODUCTION
Various prediction models exist today and are used by financial and insurance corporations to predict credit risk. Over the years, many researchers and academics have begun applying artificial intelligence (AI) and machine learning models to predict the various risks insurance corporations face. Research studies have shown that these approaches can play an important role in analyzing huge amounts of data and generating results that help predict customer outcomes. To minimize risk and stay competitive in the market, implementation of the latest technologies has become essential [1].
If insurance corporations do not analyze the various types of risk effectively, their chances of growth in the market will decline. If risks are not managed or mitigated on time, the insurance corporation can face substantial financial loss and, in the worst case, might have to close its operations. The insurance industry is built on risk calculation, profitability, and detailed information, and the predictive models insurance companies use allow them to make accurate business decisions. This research focuses on predicting insurance customers using a deep learning model combined with the Hybrid Grey Wolf-Whale Optimization algorithm.
A crucial aspect of running a business is risk management, because managers frequently have limited knowledge of how shifts in the market, competition, and customer preferences may impact their company. Taking risks can sometimes result in greater success, but also in failure. The insurance industry, for instance, heavily relies on risk calculation to increase profits and make better decisions. There are various ways of examining risk, including predictive, descriptive, and decision models. To accurately predict potential insurance customers, researchers are currently working on a new deep learning model and optimization strategy [5].
This research's primary focus is resolving an issue in the Property and Casualty (P&C) insurance industry: the customer prediction algorithms and Deep Learning (DL) models used in this field require enhancement. This is crucial because accurate predictions can assist insurance companies in making educated choices regarding pricing and risk, ultimately affecting their profitability. To accomplish this, we have investigated meta-heuristic algorithms, specifically the hybrid grey wolf and whale optimization (HGWWO) [6]. These algorithms can help refine the prediction models and improve their accuracy. The research includes measuring the performance of these models and comparing them with existing approaches. By resolving this issue, we aim to contribute to the development of more powerful and efficient customer prediction methods in the P&C insurance industry [7].

II. RELATED WORKS
Sushanth Manakhari and Yanzhen Qu (Y. Qu: Colorado Technical University, USA; e-mail: yqu@coloradotech.edu)

The different predictive analytic models, such as Decision Tree, Regression, Artificial Neural Network (ANN), Bayesian statistics, Gradient Boosting, Ensemble Learning, Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), and time-series analysis, have been analyzed and elaborated in prior studies, along with their implementation methods, research gaps, and the associated data analysis and quantitative analysis tools. This section discusses the big data tools, concepts, and platforms useful for further direction. Different researchers mainly focus on resolving these issues by developing efficient models that can provide reliable results; accordingly, this research focuses on developing a predictive model based on big data analytics and deep learning (DL) technology to manage and analyze potential customers in the insurance industry.

A. Hybrid Grey Wolf-Whale Optimization
Prior studies have focused on the Hybrid Grey Wolf-Whale Optimization algorithm and its efficiency in predicting future outcomes [2]. These studies state that the HGWWO model predicts future outcomes effectively; however, further research is needed to establish its accuracy and efficiency across different fields. Future researchers can use the existing HGWWO studies as a basis for new work, providing new information and filling the current gap in the literature [2].
Imagine a group of grey wolves and a group of whales working together to solve a problem, where each wolf and whale represents a candidate solution. The grey wolf part of the algorithm is inspired by the social behavior of grey wolves in nature: wolves hunt as a pack, and in the algorithm the grey wolves likewise collaborate and communicate to find the best solution. The whale part is inspired by the behavior of whales in the ocean, which navigate and communicate over long distances in search of food; in the algorithm, the whales explore the problem space and share their findings to improve the solution. The hybrid approach combines the strengths of both techniques, leveraging the collaborative nature of grey wolves and the exploration capabilities of whales to find an optimal solution [2].
The algorithm works in iterations, or generations. In each generation, the grey wolves and whales update their positions based on a fitness function that evaluates how good their solutions are. Individuals with better solutions have a higher chance of survival and reproduction, while weaker ones are gradually eliminated. Through collaboration, exploration, and selection, the algorithm mimics the natural behavior of wolves and whales to improve the solution over time and solve complex problems efficiently [2].
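The generational scheme just described can be sketched in code. The following is a minimal, self-contained illustration on a toy sphere objective; the 50/50 random switch between the grey-wolf and whale updates, the agent count, and the coefficient schedules are illustrative assumptions, not the exact hybridization used in the study.

```python
import numpy as np

def sphere(x):
    """Toy objective to minimize: sum of squares, optimum 0 at the origin."""
    return float(np.sum(x ** 2))

def hgwwo(objective, dim=2, n_agents=10, iters=50, lb=-10.0, ub=10.0, seed=0):
    """Minimal hybrid sketch: each agent moves either by the grey-wolf update
    (guided by the three best agents: alpha, beta, delta) or by the whale
    spiral update around the best agent, chosen at random each generation."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(lb, ub, size=(n_agents, dim))
    for t in range(iters):
        fitness = np.array([objective(p) for p in pos])
        order = np.argsort(fitness)
        alpha = pos[order[0]].copy()   # best agent of this generation
        beta = pos[order[1]].copy()
        delta = pos[order[2]].copy()
        a = 2.0 - 2.0 * t / iters      # GWO coefficient decays from 2 to 0
        for i in range(n_agents):
            if rng.random() < 0.5:     # grey-wolf move: average pull toward the leaders
                new = np.zeros(dim)
                for leader in (alpha, beta, delta):
                    A = a * (2.0 * rng.random(dim) - 1.0)
                    C = 2.0 * rng.random(dim)
                    new += leader - A * np.abs(C * leader - pos[i])
                pos[i] = new / 3.0
            else:                      # whale move: logarithmic spiral around alpha
                l = rng.uniform(-1.0, 1.0)
                d = np.abs(alpha - pos[i])
                pos[i] = d * np.exp(l) * np.cos(2.0 * np.pi * l) + alpha
            pos[i] = np.clip(pos[i], lb, ub)  # keep the agent in the search space
    fitness = np.array([objective(p) for p in pos])
    return pos[np.argmin(fitness)], float(fitness.min())

best_pos, best_fit = hgwwo(sphere)
print(best_pos, best_fit)
```

As the coefficient `a` decays, the wolves shift from exploration to exploitation, while the spiral term keeps the whales probing around the current best; the population's best fitness shrinks toward the optimum over the generations.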

B. Deep Learning Optimization
Deep learning technology has no doubt revolutionized the world and allows big data to be processed quickly. With deep learning, many corporations can now meet their customers' needs and preferences more efficiently and provide better customer service. Deep learning has also improved forecasting, enabling businesses to deliver products on time. These technologies have improved operational efficiency and accuracy considerably, and many research studies have documented the success of deep learning in organizations [7].
This concept is appropriate for suggesting prospective customers, where customized recommendations are developed from the available data on the potential insured. The Bayesian Network (BN) based method used in that study was scalable for inference and training, so implementation in a distributed fashion was simple. The process was also validated against the obtained results to assess its accuracy. Nevertheless, while the constructive graph used in the study was fast during training, its inference time was expensive, a limitation that should be addressed in future work [9].

C. Artificial Intelligence and Machine Learning
Many existing studies have provided detailed information on how AI and machine learning can analyze insurance corporation customers. According to those studies, insurance customers must be differentiated based on different variables. The researchers utilized C4.5 and Naive Bayes techniques with factors including job, age, region, gender, and marital status, and the method achieved an accuracy of 90.11%. This shows that machine learning and AI techniques can effectively predict customers in the insurance sector [7].
AI and ML algorithms aim to find insurance products suitable for customers based on similar features, and can be applied to both new and existing customers. Customers are influenced by the various products available in the market, and the agents or products recommended to a customer are selected based on that customer's information and history. Insurance companies aim to anticipate customer needs and provide relevant products. In a DL-based approach that uses external data to make recommendations to potential customers, Bayesian Networks are used to model the system [3].
In addition, an insurance business has distinctive characteristics in terms of domain size, product complexity, and required customer expertise, while regulatory limitations, system interactions, and the demands on customer attention make it a challenging setting. Market-oriented valuation also includes the concepts of shareholder value (SHV) and customer lifetime value (CLV). Common SHV models involve high levels of aggregation, whereas targeting individual clients requires disaggregated revenue; in most cases, estimating income and costs and integrating retention rates required considerable effort.
In their product marketing functions, most insurance companies struggle with prospecting. Important attributes such as marital status, service, age, gender, and children were used from various sample datasets. With 71.7% accuracy, 79.1% AUC, and 77.6% recall, Neural Network (NN) algorithms outperformed the other classification algorithms. However, some criteria were multi-criteria questions that needed to be clarified and simplified during the evaluation process. As a result, the expectation-performance gap in life insurance has been difficult to bridge, and claims payout in an industry with multiple expectations has taken much effort. Various studies have been conducted to improve service quality; despite this, thorough research on neural networks is lacking [6].
In addition, insurers are trying to reshape the insurance industry by using various insurance technologies to improve and update existing products, develop new strategies, and create new offerings. Risk prediction using DL methods relies on a conceptual model whose basic components include risk prediction, driving-style detection, risk modeling, data transformation, and criteria mining. In most cases, DL-based extraction methods determine the criteria mining factors. A deep sparse autoencoder was subsequently used, with notable problems and vulnerabilities that should be the focus of future work. The resulting models are used in decision support systems that help insurers (e.g., in the vehicle domain) measure a policyholder's risk with respect to specific limits [5].
Insurance agencies depend on attracting prospective clients to grow their business. To accomplish this, they frequently collect customer records, claim history, and financial transaction data online. This data allows executives and managers to better understand customers' preferences and behaviors, which in turn helps encourage customers to stay with the company longer and purchase additional policies [4].
Insurance companies typically focus on two main factors when analyzing customer behavior: claim risk and premium benefit. Customers can be divided into four groups based on these factors: high profit and high risk, low profit and high risk, high profit and low risk, and low profit and low risk. However, there is a growing concern about ethical discrimination in the insurance business, especially in risk prediction algorithms [5]. To address this issue, ethical discrimination analysis ensures that predictive models do not discriminate and that they adhere to ethical guidelines established by regulators and business experts. Insurance companies frequently employ machine learning methods to train these predictive models on relevant data [6]. Because the primary input and output data are frequently based on details about existing customers, these models can analyze potential customers and determine whether or not they are a good fit for the business. The more advanced the AI approaches, the better the chances of predicting the risks in any specified domain [10].

D. Deep Learning Conceptual Model
A conceptual model utilized in risk identification employs the data values while taking the dataset's features and DL capabilities into account. The model's components include risk prediction, driving-style detection, risk modeling, criteria mining, and data transformation. The criteria mining component depends on DL for extraction from the dataset. Moreover, the deep sparse autoencoder used in this study performs well on similar problems. During risk calculation, the computed risk scores are compared with the model's output to evaluate the framework's accuracy [8].

E. Bi-Level Approaches
Two bi-level techniques were utilized to develop a loan prospecting approach. The non-hierarchical approach executes the two classification tasks separately and tests performance by integrating their predictions, while the hierarchical approach uses one classifier's predictions as input to the other; both approaches utilize convex combinations during the prediction tasks. The prediction tasks were framed as binary classification problems, and classical ML and DL classification methods were tested [3].

F. FCM-Fuzzy C-Means Clustering
Fuzzy clustering models are utilized with Fuzzy C-Means (FCM) and the modified whale optimization algorithm (MWOA). The quantitative clustering technique's effectiveness is assessed against existing metrics, with a sampling method used to optimize the cluster centroids applied in the automobile insurance fraud detection system (AIFDS). The AIFDS first removes outliers from the sample data using the fuzzy clustering technique, and the modified dataset is then fed to advanced classifiers, including Decision Tree, LightGBM, Random Forest, XGBoost, and CatBoost [9]. The classifiers are evaluated by measuring parameters such as accuracy, specificity, and sensitivity. Ultimately, the AIFDS comprising fuzzy clustering with MWOA followed by CatBoost performs better than the others [5].

G. LDA - Latent Dirichlet Allocation-Based Technique
LDA-based text analytics has been utilized for fraud detection in automobile insurance. LDA is used to extract text features hidden in the accident descriptions that accompany claims. Deep neural networks (DNNs) are then trained on the data, including both numeric and text features, to identify fraudulent claims, using a real-world insurance fraud dataset. The experimental results demonstrate that the DNN outperforms widely used ML models, including SVM and RF. Hence, this framework, integrating LDA and DNN, is considered an appropriate tool for automobile insurance fraud detection [14].

III. RESEARCH PROBLEM

A. Problem Statement
This study focuses on a problem in the Property & Casualty (P&C) insurance industry. Various models are used for predicting insurance customers, such as Deep Learning (DL) models, hybrid grey wolf and whale optimization (HGWWO), the Gradient Boosting Decision Tree (GBDT) model, and Deep Neural Networks (DNN). Many research studies have proven the success of machine learning, deep learning, and artificial intelligence predictive models in the insurance sector [3], and many have detailed the use of Naïve Bayes, Random Forest, and Logistic Regression for predicting insurance customers as well. However, the accuracy and effectiveness of HGWWO in the insurance sector still need investigation [4]. This research implements the algorithm to check how efficiently it predicts customers in the insurance sector; in short, the model's prediction accuracy will be investigated.

B. Hypothesis Statement
If insurance corporations apply Hybrid Grey Wolf and Whale Optimization (HGWWO) to Property & Casualty (P&C) insurance data, then the prediction accuracy and performance of the model will improve significantly compared with other predictive models [5].

C. Research Question
How will implementing Hybrid Grey Wolf and Whale Optimization (HGWWO) on Property & Casualty (P&C) insurance data impact prediction accuracy and model performance?

IV. METHODOLOGY

A. Method
To conduct this study, a quantitative research methodology has been implemented. A significant amount of quantitative data has been gathered from various research studies, covering artificial intelligence, deep learning, machine learning, and other predictive technologies and methods. A reliable quantitative analytical big data method has been developed during experimentation, in which Hybrid Grey Wolf and Whale Optimization (HGWWO) is utilized to evaluate the behavior of insurance organizations' customers. A large amount of secondary data has been collected for the research [7].

B. Population and Sample
Including all studies and datasets in a single study is impossible, so a sample has been taken from the datasets. For data collection, 79,853 observational records gathered by previous researchers through interviews and surveys have been included in this study. The sampling method allows easy access to the customer and respondent data. This research study uses a sample size of 25,000 records. It is important to note that the size of the dataset will significantly impact the study results [8].

C. Evaluation of Data
After collecting this large amount of data, the next step is analysis, so that the reliability and validity of the collected data can be assessed. Data analysis is one of the most important phases of research methodology because results cannot be formulated without it. Statistical measures such as Mean Squared Error (MSE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE) were utilized to analyze the data. After the analysis, the results are formulated to answer the research question in detail.
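The three error measures used in the evaluation can be computed directly from paired actual and predicted values; a small self-contained sketch with illustrative numbers:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error: average of the squared residuals."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))

def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the MSE."""
    return float(np.sqrt(mse(y_true, y_pred)))

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error: mean of |error| / |actual|, in percent."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

# Illustrative premiums (not from the study's dataset)
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 420.0]
print(mse(actual, predicted))   # 200.0
print(rmse(actual, predicted))  # ~14.14
print(mape(actual, predicted))  # ~6.67 (percent)
```

Note that MAPE divides by the actual values, so it is undefined when any actual value is zero; MSE and RMSE have no such restriction.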

V. EXPERIMENT AND RESULTS
This study selects the optimal parameters for the deep learning model using the HGWWO algorithm. The optimized parameters include the number of networks, the number of dense layers, the normalization value, the number of skip layers, and the dropout rate. Utilizing the HGWWO algorithm increases the accuracy of the deep learning model [9].
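For an optimizer like HGWWO to search over hyperparameters such as those listed above, each agent's continuous position vector must be decoded into concrete parameter values. The names and ranges below are hypothetical illustrations of that decoding step, not the values used in the study:

```python
# Each entry: (parameter name, lower bound, upper bound, type cast).
# Ranges are illustrative assumptions only.
SPACE = [
    ("dense_layers", 1, 8, int),        # number of dense layers
    ("units", 16, 256, int),            # units per dense layer
    ("skip_layers", 0, 3, int),         # number of skip connections
    ("dropout_rate", 0.0, 0.5, float),  # dropout probability
    ("l2_penalty", 1e-6, 1e-2, float),  # normalization (weight decay) value
]

def decode(position):
    """Map a position vector with components in [0, 1] to concrete
    hyperparameter values, so the optimizer searches a continuous space
    while the model receives valid discrete/continuous settings."""
    params = {}
    for p, (name, lo, hi, cast) in zip(position, SPACE):
        p = min(max(p, 0.0), 1.0)             # clamp to the unit interval
        params[name] = cast(lo + p * (hi - lo))
    return params

print(decode([0.5, 0.5, 0.0, 0.2, 0.1]))
```

The fitness of an agent would then be the validation error of a model trained with `decode(position)`, closing the loop between the optimizer and the deep learning model.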

A. Results
The results directly illustrate how the HGWWO algorithm can solve threshold selection on an insurance dataset. In the example, two records from the insurance dataset are included, with age and premium information. The study initialized the search agents (wolves and whales) inside the search space and determined the fitness value of each search agent relative to the objective function being optimized. A neural network with two input nodes and one output node predicts the insurance premium based on the customer's age [6].
Two records from the insurance dataset are considered for this example, as shown in Table I. The search agents (wolves and whales) are randomly initialized within the search space; assume two wolves and two whales, with positions and velocities initialized as shown in Table II. The fitness value of each search agent is then calculated from the objective function to be optimized, here the Mean Squared Error (MSE), which we want to minimize. Assume a neural network with two input nodes and one output node that predicts the insurance premium based on the customer's age: we train the network on the dataset and then use it to predict the premiums for the two records. After updating the position and velocity of the search agents using GWO and WOA, boundary constraints must be applied to keep the agents within the search space. In this example, the search space for both age and premium is between 0 and 100, so the updated positions must be checked against this constraint and adjusted accordingly; if a search agent's position falls outside the search space S, it is repositioned randomly within S [36]. Finally, the best search agent and its fitness value are updated from the fitness values of all search agents. Here, a search agent's fitness value is its prediction error, and the agent with the lowest prediction error is considered the best [33].
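The boundary-handling rule described above, repositioning an out-of-range agent uniformly at random within S, can be sketched as a small helper (the 0 to 100 bounds follow the example; the random seed is an arbitrary choice for reproducibility):

```python
import numpy as np

def enforce_bounds(position, lb=0.0, ub=100.0, rng=None):
    """If any component of a search agent's position leaves the search
    space S (here, 0..100 for both age and premium), replace that
    component with a uniformly random value inside S."""
    if rng is None:
        rng = np.random.default_rng(0)
    position = np.asarray(position, float)
    outside = (position < lb) | (position > ub)
    position[outside] = rng.uniform(lb, ub, size=int(outside.sum()))
    return position

# The premium component (135) has left the space; the age component stays put.
agent = enforce_bounds([42.0, 135.0])
print(agent)
```

An alternative is simple clipping to the boundary; random repositioning, as described in the text, preserves more exploration because agents do not pile up on the bounds.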
The research discussed the Hybrid Grey Wolf-Whale Optimization algorithm and the steps involved in the optimization process: HGWWO initialization, evaluation, and optimization. We showed how to calculate the distance to the best whale, the average position of the wolves, and the current position of each search agent using pseudocode for the HGWWO algorithm [10]. Fig. 3 shows that the HGWWO-DNN model performs well, with training and validation accuracy consistently above 96%. However, there is still a small gap between training and validation accuracy, particularly in epochs 9 and 10, where the validation accuracy is slightly lower than the training accuracy, suggesting that the model may be slightly overfitting to the training data. Fig. 4 shows that the HOpto DNN model's training and validation loss decrease consistently over the epochs, again with a small gap in epochs 9 and 10, where the validation loss is slightly higher than the training loss, pointing to the same mild overfitting.
The review provides visual representations of the accuracy and loss of the HOpto DNN model during training and validation. The plots show that the model performs well, though overfitting to the training data is possible at higher epochs. The study suggests implementing regularization techniques, hyperparameter tuning, data augmentation, cross-validation, and ensemble learning to address this issue [26].
The results were visualized using various performance comparison metrics. We observed that the HOpto algorithm yielded the highest accuracy. Accuracy is a metric that describes how the model performs across all classes and is useful when all classes are of equal importance; it is calculated as the ratio of the number of correct predictions to the total number of predictions.
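As a quick illustration of that ratio, with made-up class labels:

```python
def accuracy(y_true, y_pred):
    """Accuracy: number of correct predictions divided by total predictions."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# 9 of 10 labels match, so the accuracy is 0.9
acc = accuracy([1, 0, 1, 1, 0, 1, 0, 0, 1, 1],
               [1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
print(acc)  # 0.9
```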
Based on the accuracy scores provided in Fig. 5, the proposed model, HOpto DNN, outperformed the other models (XGBoost, RandomForest, and Decision Tree) with an accuracy of approximately 98%. This indicates that the HOpto DNN model was able to learn the patterns in the data better than the other models and make more accurate predictions.
The Mean Absolute Percentage Error (MAPE) is one of the most widely used KPIs for measuring forecast accuracy. MAPE is computed by dividing each period's absolute error by the actual demand for that period and averaging the results; that is, it is the average of the percentage errors [22].
Based on the Mean Absolute Percentage Error (MAPE) scores provided in Fig. 6, the proposed model, HOpto DNN, outperformed the other models (XGBoost, Random Forest, and Decision Tree) with a MAPE score of approximately 3%. This shows that, on average, the HOpto DNN model made predictions closer to the actual values than the other models did. The Mean Squared Error (MSE) is the simplest and most common loss function, often taught in introductory machine learning courses. To calculate the MSE, take the difference between the model's predictions and the ground truth, square it, and average it across the whole dataset. The MSE is never negative since the errors are always squared.
Based on Fig. 7, the proposed model, HOpto DNN, appears to outperform the other models in terms of MAPE and MSE; the lower these values, the better the model's performance. Compared to the XGBoost, Random Forest, and Decision Tree models, HOpto DNN has lower MAPE and MSE values, indicating better performance. Root mean square error (or root mean square deviation) is one of the most commonly used measures for evaluating prediction quality; it expresses how far predictions fall from the measured true values in terms of Euclidean distance. To compute RMSE, calculate the residual (the difference between prediction and truth) for each data point, square each residual, compute the mean of the squared residuals, and take the square root of that mean. RMSE is commonly used in supervised learning applications because it requires a true measurement at each predicted data point. The proposed HOpto DNN model performs best on all the evaluation metrics (MAPE, MSE, and RMSE): its MAPE value is the lowest, indicating the smallest average percentage error; its MSE value is the lowest, indicating the smallest average squared error; and its RMSE value is the lowest, indicating the smallest root mean squared error [28].
Comparing it with the other models, we can see that XGBoost and Decision Tree perform similarly in terms of RMSE, while Random Forest performs slightly worse. In terms of MAPE and MSE, Random Forest performs the worst of all the models, while XGBoost and Decision Tree perform better but still worse than HOpto DNN [35].
Therefore, we can conclude that the proposed model HOpto DNN outperforms all the other models in terms of prediction accuracy, as indicated by the low MAPE, MSE, and RMSE.

B. Summary
In summary, we have used performance comparison metrics for evaluating different machine learning models. These models include the HOpto algorithm, XGBoost, RandomForest, and Decision Tree. The models are evaluated based on different metrics like accuracy, mean absolute percentage error (MAPE), mean squared error (MSE), root mean square error (RMSE), and confusion matrix.
The HGWWO algorithm yielded the highest accuracy, indicating that it performed better in learning the patterns in the data and making more accurate predictions than the other models. The proposed model, HGWWO-DNN, outperformed the other models regarding MAPE, MSE, and RMSE, indicating better performance. The confusion matrix helped determine the model's performance regarding false positives and negatives. HOpto DNN performed better in all metrics and showed a lower percentage of false positives and false negatives, indicating better overall performance.
The proposed model, HGWWO-DNN, outperformed the other models regarding prediction accuracy, as indicated by the low values of MAPE, MSE, and RMSE and the lower percentage of false positives and false negatives in the confusion matrix. The given summary provides a useful overview of different evaluation metrics that can be used to compare machine learning models.

VI. CONCLUSION
This paper has demonstrated that applying the HGWWO algorithm together with a deep learning model can improve accuracy and performance in analyzing a P&C insurance dataset.
The study findings on the HGWWO algorithm are subject to several limitations that may affect their generalizability and reliability. One of the main challenges relates to the sampling of P&C insurance data, which can be biased and incomplete for several reasons, including the lack of standardized data collection procedures, differences in data quality across sources, and issues related to data privacy and confidentiality. Moreover, the findings may not generalize to other types of insurance, such as life or health insurance, or to different geographical regions, as insurance markets vary significantly across countries and regions. Additionally, the reliability and trustworthiness of the data used in the study can be influenced by data errors and missing or incomplete data, which may introduce noise and bias into the analysis.
In conclusion, the research proposes using the HGWWO algorithm and the HGWWO-DNN model to improve prediction accuracy in the Property and Casualty insurance industry. The study findings on the HGWWO algorithm provide valuable insights into developing optimization algorithms for insurance pricing and risk management. Despite the limitations related to data sampling, generalizability, and data trustworthiness and reliability, the study's rigor and scientific validity enhance the reliability and robustness of the results and provide a useful reference for future research in the field [31].

In his professional career, he has worked at multiple IT organizations at various software development, architecture, and management levels. He is a Technical Lead and Subject Matter Expert for a Fortune 500 company in Erie, PA, USA.
His current and future research interests are in Artificial Intelligence, Data Science, Machine Learning, Deep Learning, etc., that can be applied in financial and insurance institutions or other industries to improve operations and risk management processes in an automated structure.