Cluster Head Selection in Device to Device (D2D) Communication Based on Weighted Network Performance Factor

In this study, an unsupervised machine learning algorithm, the Self Organizing Map (SOM), was utilized to cluster D2D User Equipment using their network values as inputs. A weighting factor referred to as the Hardware Sensing Factor (HSF) was formulated to take into account a device's channel quality and the status of its underlying hardware circuitry. The values of the HSF were used as inputs to cluster the devices and to select a cluster head for each cluster. The performance of SOM with HSF as input was compared with its performance when RSSI, RSRP or RSRQ was used as input. The comparison showed that the use of HSF as input to the SOM cluster algorithm gave better cluster performance than the use of individual network values such as RSSI, RSRP or RSRQ. In addition, the use of HSF as input data to both the SOM and K-Means algorithms showed that SOM cluster formation performs better than the K-Means algorithm.


Introduction
Device to device (D2D) communication is a technology that enables direct communication between User Equipment (UE) with or without the involvement of the evolved NodeB (eNodeB) or base station. Thus, the use of D2D significantly reduces the traffic between the eNodeB and the UEs, improves network utilization, and helps save scarce network resources. D2D communication has been integrated into the development of the 5G cellular system and is expected to play a key role in the development of 6G networks [1], [2].
The authors in [2] pointed out that the recent attention drawn by D2D is due not only to its benefits but also to its diverse applications. D2D can offload traffic from the core network, thus reducing latency as well as the overload of the network. In addition, in emergency situations where there is no network coverage due to disruption of cellular services or destruction of network infrastructure (whether by man-made crises or natural disasters), proximate devices can establish and maintain communications with each other. Furthermore, D2D communications can help to extend network coverage to cell edge users, who normally experience increased fading and poor signal strength. In such situations, a device can be selected to act as a relay between the edge users and the base station. Also, D2D can be employed in reliable health monitoring systems and in the dissemination of information to targeted audiences, such as in shopping malls or advert promotions that utilize D2D social-aware communications.
There are key factors that influence the effectiveness of D2D communication. They include resource allocation, cluster formation, mobility, and connection management [3]. Cluster formation is therefore one of the key factors that determine the effectiveness of D2D communication. This assertion is corroborated by the authors in [1], who noted that D2D devices are required to form clusters in order to take full advantage of various new services introduced by 3GPP release. Cluster formation divides the network into groups of geographically proximate devices, thus optimizing and simplifying network functions. It also enables neighboring devices to communicate and share network resources, which saves resources such as energy and bandwidth [4].
Various clustering algorithms have been applied in the literature to solve cluster formation in D2D-enabled networks. According to [5], clustering algorithms can be categorized into hierarchical algorithms, distance/similarity-based algorithms, density clustering algorithms, graph theory, and squared error-based algorithms. In addition, authors such as [6] and [5] have recently applied machine learning to form clusters in D2D communications.
However, [7] stated that a good D2D clustering algorithm is marked by key features, namely efficient identification and selection of the cluster head (CH) and the cluster members (CMs), in addition to efficient intra- and inter-cluster communication. Thus, efficient selection of the CH is imperative in cluster formation. As pointed out by [1], the choice of CH during cluster formation should be made carefully and optimally, because the selection of any device as the CH affects network parameters such as the energy efficiency and QoS experienced by the CMs in the cluster. In addition, the devices selected to act as CHs must be reliable in terms of energy dissipation and device mobility. This is required to avoid service or session discontinuity during D2D connections [3]. In this study, the Self Organizing Map (SOM), an unsupervised machine learning algorithm, was adopted to efficiently select CHs that satisfy predefined network performance metrics.

Related Literature
A very important feature that a good clustering algorithm should exhibit is the effective selection of a CH from among proximate devices. The CH is selected to act as the coordinator of the CMs in a cluster. It also coordinates both intra- and inter-cluster communications [1], [8], [9]. In addition, in the event of transmission failure, the CH can help the CMs retransmit any stored information [10]. There are rules and decisions that dictate the selection of the CH. Such rules are needed because the choice of CH greatly influences the reliability and stability of a cluster. In situations such as natural or man-made disasters, national security or public safety, CH selection must be optimal so that if the base station is dysfunctional or damaged, the CH can take up some of the responsibilities of the failed base station [11]. Also, [12] stated that the clustering algorithm adopted and the criteria used in CH selection are critical factors that influence network energy efficiency.
To satisfy the requirements for effective CH selection, various criteria for CH selection have been proposed in the literature. The selection of the CH based on the distance between the associated CMs was proposed by authors such as [11], [13]. On the other hand, the authors in [1] proposed the use of QoS parameters as the criterion for CH selection. This was necessitated by the variation in channel features and resources between each CM and the CH within a cluster. A combination of distance and Signal to Interference plus Noise Ratio (SINR) as the basis of CH selection was adopted by [14], and their results showed that system capacity and energy efficiency are improved. An optimized CH selection that made use of three factors, namely the device energy, the distance, and the social tie (or trust) between the proximate devices, was proposed by [15]. In that study, the authors showed that the scheme that utilized the three factors performed better in terms of average social tie than a scheme that utilizes only distance. In addition, the three-factor scheme performs better in terms of energy consumption and transmission reliability than the scheme that employed only the distance metric. Similarly, a distributed dynamic network-assisted clustering scheme that used an improved K-means algorithm to select the CH and cluster devices based on the channel quality and the device location was simulated and analyzed by [12]. It was observed that the proposed scheme improves the efficiency of the network.
Furthermore, the authors in [16] proposed the use of social interactions between users and a social-aware approach to select a suitable CH. The authors opined that the choice of CH should be based on the type of social relationship that exists between it and other users. It was noted that by considering these social attributes, the D2D network efficiency can be enhanced. Also, the criterion of using social tie or trustworthiness among the D2D users to optimally select the CH ensures reliability with respect to the protection of data integrity and privacy.
On the other hand, a CH that should, among other roles, act as a relay should be efficiently and optimally chosen. When a CH that acts as a relay is optimally selected, the network efficiency increases, and the communication range and the total coverage of the network are extended. The authors in [17] adopted a combination of power control, channel/resource allocation, regional boundaries, and social ties among users to select the CHs that would act as relays. The outcome of their proposal showed that the data rate of the network and the D2D network performance are enhanced.
Some works in the literature considered the use of a weighted technique to select the CH. In this approach, the target devices are assigned weights based on certain metrics. Depending on the logic behind the weight assignment, the device that satisfies the condition of a suitable weight measure is selected as the CH. One of the works that adopted this technique is presented by [18]. In that publication, the authors assigned weights to the target devices. The assigned weights are the number of CMs to support, the signal strength received, the cumulative time a device can be a CH, the capability of the device, and the tendency of the potential CH to be mobile. After each device broadcasts its own weights, the device with the minimum weight is selected as the CH. The results from their work showed that the weighted technique ensures a high rate of communication and device discovery, though the discovery process is energy intensive.
In this research study, a weighted approach was adopted to select an appropriate CH for each cluster. To achieve this, weights were assigned to measured network performance parameters, namely the Reference Signal Received Power (RSRP), Received Signal Strength Indicator (RSSI), and Reference Signal Received Quality (RSRQ). The devices with high signal strength and quality, as represented by the assigned weights, were selected as the appropriate CHs. Other devices with lesser weights were chosen as the CMs. The rest of the work is divided into the following sections: the methodology is described in section III; the results and discussions are presented in section IV; and section V contains the conclusion.

Methodology
A machine learning algorithm such as SOM needs data to form clusters. The data can be collected by the UEs or by the Radio Access Network (RAN). In this study, one hundred samples of network parameters (RSSI, RSRP and RSRQ) were collected within a 250 m radius around a base station. These samples were collected using four UEs from different vendors and were used to represent one hundred UEs served by the base station.
The network parameter (RSSI, RSRP, or RSRQ) of each device was assigned a value representing its weight. The values of these weights were used as inputs to the SOM clustering algorithm. The procedures adopted in the formation of clusters are described as follows:

Network Performance Weight Formulations
Modern smart mobile devices can track and measure network parameters such as the RSSI, RSRP, and RSRQ. Notably, mobile devices from different vendors do not measure and report the same value of a network quantity, say RSRP, even when they are placed in the same location. For example, a device from one vendor may report a measured RSRP of −82 dBm at a place, while a device from a different vendor placed side by side with it may report −84 dBm. More notably, such variation is also observed among equipment from the same vendor, whether of the same or different models.
Fig. 1 shows the measurements of the network parameters by the same UE placed in the same location but at different time intervals, while Fig. 2 shows measurements by different UEs at the same location and in the same time interval. It is evident that the cause of these variations in measured network parameters is related to two factors. The first is the device's hardware constitution, which could be attributed to its processor performance or stability, the strength of the battery, the device's antenna selectivity, its sensitivity to interference and surrounding/thermal noise, or, at worst, factory defects. The second factor is the device's link or channel quality. A good channel quality presents better results than a poor channel.
In this research work, a weighted index termed the Hardware Sensing Factor (HSF) was used to select an appropriate CH from each cluster. The technique involves the assignment of numerical values as weights to the ranges of values of the network parameters. These network parameters have various reporting ranges, and each range describes the quality or strength of the signal represented by the parameter. The research work adopted the classification of these ranges and formulated the weights of each of the parameters. The network parameters, along with the reporting ranges and the assigned numerical weights, are presented in Table I.
The weight assigned to each reporting range of the network metrics is termed an index factor. In Table I, the index factors of the parameters are represented as Ri, Si, and Qi. For every measurement from a device, the reported parameters are assigned weights based on the range of the reported value. The weights are summed to obtain the Hardware Sensing Factor of the device at that particular reporting. For instance, if a measurement from a device reported the values RSRP = −85 dBm, RSSI = −54 dBm, and RSRQ = −11 dB, the corresponding index factors according to Table I are Ri = 0.75, Si = 1.00, and Qi = 0.75. The summation of these index factors gives an HSF of 2.50. For any reporting, the HSF has a maximum value of 3.00 and a minimum value of 1.00. An illustration of the values of HSF obtained from various reporting ranges of the network parameters is shown in Table II. Due to space constraints, only the network values from six devices are shown in Table II.
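The HSF calculation described above can be sketched in a few lines of Python. This is only an illustration: Table I is not reproduced in this text, so the thresholds and weights in `ASSUMED_RANGES` are hypothetical placeholders chosen to agree with the worked example (RSRP = −85 dBm, RSSI = −54 dBm, RSRQ = −11 dB) and with the stated HSF bounds of 1.00 and 3.00. The function names are likewise illustrative.

```python
from bisect import bisect_right

# HYPOTHETICAL reporting ranges: these are NOT the values of Table I.
# They are placeholders picked only so that the worked example in the text
# (Ri = 0.75, Si = 1.00, Qi = 0.75) and the HSF bounds (1.00 to 3.00) hold.
ASSUMED_RANGES = {
    # parameter: (ascending thresholds, index factor of each resulting range)
    "rsrp": ([-100.0, -90.0, -80.0], [0.25, 0.50, 0.75, 1.00]),  # dBm
    "rssi": ([-75.0, -65.0], [0.50, 0.75, 1.00]),                # dBm
    "rsrq": ([-20.0, -15.0, -10.0], [0.25, 0.50, 0.75, 1.00]),   # dB
}


def index_factor(parameter, value):
    """Return the weight of the reporting range that contains `value`."""
    thresholds, weights = ASSUMED_RANGES[parameter]
    return weights[bisect_right(thresholds, value)]


def hardware_sensing_factor(rsrp, rssi, rsrq):
    """HSF = Ri + Si + Qi for one measurement report."""
    return (index_factor("rsrp", rsrp)
            + index_factor("rssi", rssi)
            + index_factor("rsrq", rsrq))


hsf = hardware_sensing_factor(rsrp=-85.0, rssi=-54.0, rsrq=-11.0)
print(f"HSF = {hsf:.2f}")  # HSF = 2.50 (0.75 + 1.00 + 0.75)
```

Keeping the ranges in a single lookup table means the placeholder values can be swapped for the actual Table I entries without touching the summation logic.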
The value of HSF reflects the combined hardware status and network performance of a device at a location. It reflects the link or channel condition of each D2D UE as well as the status of its analogue/digital circuits. Good performance yields a higher HSF, while poor performance is represented by a lower HSF value. A device with a higher HSF is best suited to act as the CH because it has better hardware status and link/channel quality than the rest of the devices. It can help devices with poor channel quality to relay data to and from the base station.

Cluster Algorithm
The cluster algorithm adopted in this study to select and cluster mobile devices is the Self Organizing Map (SOM). SOM is an unsupervised machine learning algorithm. Without supervision, SOM can automatically and competitively determine the number of clusters contained in a data set. SOM first determines a winner neuron, which is the neuron whose weight is most similar to the data sample. It then updates the weights of the neighboring neurons, which ensures that clusters of neurons with similar weights are formed. Two functions, the learning rate $\alpha(t)$ and the neighborhood function $h_{cj}(t)$, are utilized in updating the weight vectors. The value of the learning rate is between 0 and 1, while the Gaussian type of the neighborhood function is given as:
$$h_{cj}(t) = \exp\left(-\frac{d_{uj}^{2}}{2\sigma^{2}(t)}\right) \tag{1}$$
where $d_{uj}^{2}$ is the squared distance between the winner neuron $u$ and the excited neuron $j$. The radius of the neighborhood at iteration $t$ is represented by the parameter $\sigma$. SOM's algorithm is represented as follows:

Determine the number of clusters, represented by the number of output neurons $n$
Initialize the output neurons' weight vectors
Set the values of the learning rate $\alpha(t)$ and the neighborhood function $h_{cj}(t)$
While stopping condition is not met
    For each input $x$
        Update the weight vector $w_j(t + 1)$ of the nearest output neuron and the neighboring neurons as:
        $$w_j(t + 1) = w_j(t) + \alpha(t)\, h_{cj}(t)\, [x - w_j(t)] \tag{2}$$
    End for
    Learning rate is reduced
    Neighborhood parameter is reduced
End while

In the algorithm, the old weight vector is represented by $w_j(t)$ and the new weight vector by $w_j(t + 1)$. The ability of SOM to update both the winner and the neighboring neurons is the key feature it has over vector quantization algorithms.
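The training loop above can be illustrated with a minimal pure-Python sketch for scalar inputs such as HSF values. Several details are assumptions rather than the authors' settings: the linear decay schedules for the learning rate and neighborhood radius, the fixed number of epochs used as the stopping condition, the random weight initialization, and the names `train_som` and `assign_clusters`. The input values at the bottom are made up for demonstration.

```python
import math
import random


def train_som(samples, rows=3, cols=3, epochs=200, alpha0=0.5, sigma0=1.5, seed=0):
    """Train a rows x cols SOM on scalar samples; returns one weight per neuron."""
    rng = random.Random(seed)
    low, high = min(samples), max(samples)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [rng.uniform(low, high) for _ in grid]  # assumed random initialization

    for t in range(epochs):  # assumed stopping condition: fixed epoch count
        frac = t / epochs
        alpha = alpha0 * (1.0 - frac)        # learning rate is reduced (assumed linear)
        sigma = sigma0 * (1.0 - frac) + 0.1  # neighborhood radius is reduced, kept > 0
        for x in rng.sample(samples, len(samples)):
            # Winner neuron: the one whose weight is closest to the sample.
            u = min(range(len(grid)), key=lambda j: abs(x - weights[j]))
            for j, (r, c) in enumerate(grid):
                # Squared grid distance between the winner u and neuron j.
                d2 = (r - grid[u][0]) ** 2 + (c - grid[u][1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighborhood
                weights[j] += alpha * h * (x - weights[j])
    return weights


def assign_clusters(samples, weights):
    """Label each sample with the index of its nearest neuron."""
    return [min(range(len(weights)), key=lambda j: abs(x - weights[j]))
            for x in samples]


# Made-up HSF-like inputs, for illustration only.
hsf_values = [1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 2.5, 2.75, 3.0] * 3
weights = train_som(hsf_values)
labels = assign_clusters(hsf_values, weights)
```

Because the winner and all of its grid neighbors move toward each sample, neighboring neurons end up with similar weights, which is the property that distinguishes SOM from plain vector quantization.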

Cluster Formation Procedure
The RSSI, RSRP and RSRQ values of 100 devices were assigned numeric weights according to their reporting ranges. These weights formed the HSF of each device. The HSF values were used as input to the SOM algorithm to form clusters. In each cluster, the device with the highest HSF value was chosen as the CH, and the remaining devices became the CMs. In this study, a 3 × 3 SOM architecture was adopted; thus, nine clusters were used to cluster the devices. The flow chart of the cluster formation procedure is shown in Fig. 3.
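The CH selection step of this procedure can be sketched as follows, assuming the cluster label of every device is already available from the clustering stage. The function name, the use of list positions as dummy UE IDs, and the tie-breaking rule (the lowest UE ID wins when HSF values are equal) are illustrative assumptions; the text does not state how ties are resolved.

```python
def select_cluster_heads(hsf_values, labels):
    """Pick the highest-HSF device in each cluster as CH; the rest become CMs.

    hsf_values[i] and labels[i] belong to the device with dummy UE ID i.
    """
    clusters = {}
    for ue_id, label in enumerate(labels):
        clusters.setdefault(label, []).append(ue_id)

    result = {}
    for label, members in clusters.items():
        # max() returns the first maximum, so ties go to the lowest UE ID (assumption).
        ch = max(members, key=lambda i: hsf_values[i])
        result[label] = {
            "ch": ch,
            "ch_hsf": hsf_values[ch],
            "cms": [i for i in members if i != ch],
        }
    return result


# Made-up example: five devices in two clusters.
example = select_cluster_heads([2.5, 3.0, 1.75, 2.0, 2.75], [0, 0, 1, 1, 0])
```

Each cluster entry holds one CH and a CM list that is one shorter than the cluster size, which matches the layout of the cluster statistics reported for the clusters.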

Results and Analysis
The SOM algorithm can reduce high-dimensional input variables to two-dimensional outputs. This gives SOM a good visual display: it shows the positions of the clusters and the number of devices in each cluster. The output of SOM cluster formation using HSF values is shown in Fig. 4, in which the neurons (clusters) and their respective numbers of UEs are indicated. In Table III, the dummy number (UE ID or CH ID) used to identify the UEs as well as the CH is contained in column 2. The third column gives the HSF value of each selected CH, while the fourth column indicates the number of CMs in each cluster. Note that the number of CMs is one less than the total number of UEs in a cluster.
In this study, the performance of using HSF as input to the SOM cluster algorithm was compared with the performance when the network parameters RSSI, RSRP and RSRQ are respectively used as inputs. The outputs of the SOM algorithm using RSSI, RSRP and RSRQ as inputs are shown in Figs. 5(a)-5(c), respectively.
The clusters were validated with the Silhouette Coefficient, $\mathrm{Sil} = (b - a)/\max(a, b)$, where the variable $a$ indicates the average intra-cluster distance and the variable $b$ represents the nearest cluster distance. The Silhouette Coefficient works on two premises. Firstly, the distance between the data points in a cluster should be small. Secondly, the distance between two clusters should be considerably large. The Silhouette Coefficient is a measure of the similarity of data points in a cluster when compared with data points in other clusters. Its value ranges from −1 to +1. A Sil of 1 indicates effective clustering, a Sil of −1 signifies the worst performance, and a Sil of 0 indicates that the clusters overlap.
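As a rough illustration of how the Silhouette Coefficient of each data point can be computed for scalar inputs, the sketch below takes a as the mean distance from a point to the other points in its own cluster and b as the smallest mean distance from that point to the points of any other cluster, which is the usual definition. Assigning a score of 0 to a point that is alone in its cluster is a common convention and is an assumption here, as are the function name and the sample data.

```python
def silhouette_values(points, labels):
    """Return the Silhouette Coefficient of every point (scalar data)."""
    clusters = {}
    for i, label in enumerate(labels):
        clusters.setdefault(label, []).append(i)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own or len(clusters) < 2:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(abs(points[i] - points[j]) for j in own) / len(own)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(abs(points[i] - points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Made-up data: two tight, well-separated groups score close to +1.
good = silhouette_values([1.0, 1.2, 3.0, 3.2], [0, 0, 1, 1])
# Putting the last point in the wrong cluster drives its score negative.
bad = silhouette_values([1.0, 1.2, 3.0, 3.2], [0, 0, 1, 0])
```

A negative score for a point means it lies closer, on average, to another cluster than to its own, which is how poorly fitted data points show up in a Silhouette plot.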
On the basis of even distribution of devices, a comparison of Fig. 4 with Figs. 5(a)-5(c), where RSSI, RSRP and RSRQ were respectively used as inputs, shows that the CMs were more evenly distributed when HSF was used as the cluster input. The number of CMs for each clustering input is indicated in Fig. 6, which shows that the use of HSF as the cluster formation input gives a more even distribution of the CMs among the CHs than the use of a single network parameter such as RSSI, RSRP or RSRQ.
On the basis of Silhouette Coefficients, Fig. 7 shows the Silhouette Coefficient plot of HSF, while Figs. 8(a)-8(c) show the Silhouette Coefficient plots of RSSI, RSRP and RSRQ, respectively.
A comparison of the Silhouette Coefficient plots of Figs. 7 and 8 shows that the use of HSF gives better performance than the use of any of the network performance parameters (i.e., RSSI, RSRP or RSRQ).
The Silhouette Coefficient plot in Fig. 7 indicates that most of the HSF data points have high Silhouette Coefficient values, and none has a negative value. These two characteristics show good cluster formation when HSF is the input to the SOM algorithm. In Fig. 8(a), however, a data point of RSSI in cluster 5 has a negative value. Similarly, in Fig. 8(b), one data point of RSRP in cluster 2 and one in cluster 9 have negative values. Also, in Fig. 8(c), two data points of RSRQ in cluster 1 and one data point in cluster 6 have negative values. The negative values of these data points show that they are not well fitted to their respective clusters. Thus, it can be deduced from Figs. 6-8 that the use of HSF as input to the SOM algorithm gives the most even distribution of the devices among the CHs and the best clustering of similar data, as indicated in the Silhouette Coefficient plots.
Furthermore, the ability of SOM to cluster similar data inputs was compared with the performance of the K-Means clustering algorithm. Fig. 9 shows the output of the K-Means algorithm when HSF was utilized as the cluster input. Table IV displays the cluster statistics, indicating the number of CMs in each cluster, while Fig. 10 shows the Silhouette Coefficient plots of the K-Means cluster formation.
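For reference, a basic K-Means (Lloyd's algorithm) for scalar inputs such as HSF values can be sketched as below. The text does not give the K-Means configuration used in the comparison, so the deterministic initialization from evenly spaced positions in the sorted data, the iteration cap, and the function name `kmeans_1d` are all assumptions made for illustration.

```python
def kmeans_1d(values, k, max_iter=100):
    """Cluster scalar values into k groups; returns (labels, centers)."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumed initialization: centers taken at evenly spaced sorted positions.
    centers = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return labels, centers


# Made-up data with two obvious groups.
km_labels, km_centers = kmeans_1d([1.0, 1.1, 1.2, 5.0, 5.1, 5.2], k=2)
```

Unlike SOM, each update here moves only the center that owns a point; there is no neighborhood function pulling adjacent centers along.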
A comparison of the performance of the SOM and K-Means algorithms showed that, in terms of even distribution of the CMs (Fig. 11), the SOM algorithm performed better than the K-Means clustering algorithm. In addition, in terms of cluster algorithm validation using Silhouette Coefficient plots, a comparison of Figs. 7 and 10 showed that the SOM algorithm has better performance than the K-Means algorithm.

Conclusion
In this study, a weighting factor termed the Hardware Sensing Factor (HSF) was formulated to take into account the channel status or link quality of a D2D UE. The HSF also reflects the status of a device's analogue and digital circuits. The Self Organizing Map, an unsupervised machine learning algorithm, was used to form clusters of UEs using the HSF values as the input. In each cluster, the device with the best hardware condition and channel quality, as represented by its HSF value, was selected as the CH. The performance of SOM with HSF as input was compared with its performance when RSSI, RSRP or RSRQ was used as input. The comparison showed that the use of HSF as input to the SOM cluster algorithm gave better cluster performance than the use of individual network values such as RSSI, RSRP or RSRQ. Also, it was shown that when HSF is the input data, the SOM algorithm performs better than the K-Means clustering algorithm.

Cluster Head Selection in Device to Device (D2D) Communication Based on Weighted Network Performance Factor Idigo et al.
Fig. 1. Network measurement by the same UE, at same location, but different time interval.

Fig. 2. Network measurement by different UEs, at same location and same time interval.

Vol 7 | Issue 6 | November 2023
Fig. 3. Flow chart of cluster formation procedure.


Fig. 6. Comparison of number of CMs for each clustering input.
The performance of SOM for each input (HSF, RSSI, RSRP, RSRQ) was analyzed in two aspects. The first is the distribution of UEs among the clusters. The distribution of the devices in the clusters influences the network load on the CHs. When the devices are evenly distributed among the clusters, the CHs have an even or equal number of CMs and thus an even load. If the devices are unevenly distributed among the clusters, the CHs with many CMs carry higher loads than the CHs with fewer CMs. The second aspect is the Silhouette Coefficient plots of the clusters. The expression for the Silhouette Coefficient is shown in (3):
$$\mathrm{Sil} = \frac{b - a}{\max(a, b)} \tag{3}$$
A comparison of Figs. 7 and 10 showed that the SOM algorithm has better performance than the K-Means algorithm.

TABLE I: Formulation of Weights for the Parameters

TABLE III: Cluster Statistics of SOM: HSF as Input
The statistics of the cluster formation of Fig. 4 are represented in Table III. The cluster number (there are nine clusters) is contained in column 1. A dummy number (UE ID or CH ID) is contained in column 2.

TABLE IV: Cluster Statistics of K-Means: HSF as Input