AN INTELLIGENT DENGUE HEMORRHAGIC FEVER SEVERITY LEVEL DETECTION BASED ON DEEP NEURAL NETWORK APPROACH

Dengue hemorrhagic fever is one of the most dangerous diseases and often leads to death when its severity is handled late or improperly. To determine the severity level, a specialist analyzes the symptoms and blood test results. This research develops a system that applies a Deep Neural Network approach to provide the same analytical ability as a doctor, so that fast and precise decisions on dengue handling can be made. The research stages consisted of normalizing the data to the interval 0 to 1 with the Min-Max method, training the data on multilayer networks with fully connected and partially connected schemes to produce the best weights, validating the data, and final testing. Using network parameters of 10 input units, 1 bias, 2 hidden layers, 2 output units, a learning rate of 0.3, 1000 epochs, a tolerance rate of 0.02, and a threshold of 0.5, the system achieved a maximum accuracy of 95% on the learning data (60 data), 87.5% on mixed learning and non-learning data (40 data), and 85% on non-learning data (20 data).


Introduction
Dengue Hemorrhagic Fever (DHF) is a dangerous disease because it can cause death. In Indonesia, as many as 126,675 people were recorded as infected by DHF in 2016 [18]. This is because no vaccine for prevention or treatment of the disease has been found, and because of delays in handling.
Generally, when analyzing dengue hemorrhagic fever, paramedics or doctors first look at the symptoms that occur, such as red spots or rash on the skin, high fever, nausea, vomiting, and pain in the bones or joints felt by the patient. To confirm the illness, blood samples are then taken and tested in the lab. From these results, the disease suffered by the patient can be determined with certainty, but the analysis can only be done by a specialist. This is the basis for the researchers to develop a system to analyze dengue fever, in which the severity of the disease is determined from the blood cell counts, the number of days of fever, and blood pressure through a Deep Neural Network.
Research on dengue fever analysis has also been done by other researchers [4]. Hariman & Noviar [1] developed an expert system in 2014 to diagnose dengue disease with the Forward Chaining method through symptomatic input. Unfortunately, that study reports no definitive test results on the accuracy of the system. Susanto, N.F [2] implemented a Fuzzy Expert System method for early detection of DHF based on the patient's symptoms, and the system accuracy was only 70%. Widodo, W. et al [6] used Artificial Neural Network methods, which yielded a higher accuracy of 74%. However, the input data in these studies were based only on symptoms of the disease, which is not strong enough to support an exact analysis of DHF. To address the shortcomings of these three studies, this research no longer uses symptom data but instead uses data on the blood cell counts, blood pressure, and duration of fever. In addition, it applies a different method, namely the Deep Neural Network.
Deep Neural Network is part of Machine Learning and has the same architecture as a Neural Network [14]. In the medical or health field, the Deep Neural Network has a good ability to recognize patterns, especially complex ones. Sladojevic, et al [11] implemented one of the Deep Neural Network methods, the Convolutional Neural Network, to examine leaf disease and classified it with a precision of 96.3%. Yan, H. et al [13] applied the Multilayer Perceptron Neural Network to diagnose heart disease with an accuracy of more than 90%. Similarly, Jahangir [12] applied an Automatic Multilayer Perceptron to the Pima Indian Diabetes dataset to build a decision support system for determining diabetes, with an accuracy of 88.7%.
The researchers have also conducted several studies using neural network methods. The study entitled "An Application of Backpropagation Artificial Neural Network Method for Measuring the Severity of Osteoarthritis" [5] developed a system for measuring the severity of osteoarthritis with an accuracy of 66.6%. The study entitled "Implementation of Artificial Neural Network Method in Application Development to Measuring the Severity of Narcotics Substances in Blood" [3] produced a system for detecting the severity level of narcotics in the blood with an accuracy between 60% and 100%. The researchers have also analyzed dengue fever using a Perceptron Neural Network [16], which reached an average accuracy of 65.15%. This research aims to improve on that work: the previous research used only 30 data and inputs based on only 8 blood cell components, whereas this research uses more data, more varied input types, and a Deep Neural Network method that has more layers and is adaptive. The accuracy obtained in this research is therefore expected to be higher than in the previous research, producing a more reliable dengue fever detection system.

A. Deep Neural Network
Deep Neural Network (DNN) is a Machine Learning method that uses the concept of human thinking in learning patterns to help make decisions. Decision-making is done by analyzing patterns from existing examples, just as humans learn about things; this is called Deep Learning [7]. The DNN imitates the structure of the neural network in the human brain, comprising the processing unit, the receiver units (perceptrons), and the connective tissue. The DNN was first known as the "Artificial Neural Network"; the word "deep", from "depth", means that the network has not only two layers (input and output) but also hidden layers. According to Tuomas Nikoskinsen [14], a Deep Neural Network has the same network architecture as a Multilayer Perceptron (MLP) Neural Network. However, the two differ in their training phases: in the DNN, the number of layers used is increased when the error value obtained in training continues to increase, and vice versa. In addition, the DNN can implement the Dropout technique, which removes units to prevent overfitting [20]. One of the DNN methods that applies this concept is the Deep Belief Network (DBN). Here is the architecture of the MLP network [15]:
The algorithm in the Multilayer Perceptron Neural Network is as follows:
a) Each input unit in the input layer ($X_i$, $i = 1, \dots, n$) and the bias ($X_0$) receives a signal and propagates it to the units of the next layer (hidden units $Z_j$). In each hidden unit, the incoming signals are multiplied by their weights, summed, and added to the bias weight:
$$z\_in_j = V_{j0} + \sum_{i=1}^{n} V_{ji} X_i \quad (1)$$
The result is then passed through the activation function:
$$Z_j = f(z\_in_j) \quad (2)$$
When the sigmoid function is used, its form is [8]:
$$f(x) = \frac{1}{1 + e^{-x}} \quad (3)$$
The output signal of the activation is sent to all units in the output layer.
b) In each output unit ($Y_k$, $k = 1, 2, 3, \dots, m$), the incoming signals are multiplied by their weights, summed, and added to the bias weight:
$$y\_in_k = W_{k0} + \sum_{j} W_{kj} Z_j \quad (4)$$
The result is then passed through the activation function again:
$$Y_k = f(y\_in_k) \quad (5)$$
In Figure 2, the blood sample results indicate that the serologic value of NS1 is positive, which means the patient is positive for DHF. The example also shows that the severity level is not included in the blood sample results, so determining it requires further analysis by a specialist doctor. This is the task that the system built in this research performs.
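The forward pass described in steps a) and b) of the Multilayer Perceptron algorithm can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`sigmoid`, `layer_forward`, `forward`) and the toy weights are hypothetical, and the weight layout `weights[j][i]` simply mirrors the index order of the $V_{ji}$ and $W_{kj}$ notation.

```python
import math


def sigmoid(x):
    """Sigmoid activation: f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Propagate signals through one layer.

    weights[j][i] is the weight from input i to unit j (hypothetical layout),
    biases[j] is the bias weight of unit j.
    """
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        # Weighted sum of the incoming signals plus the bias weight.
        net = bias + sum(w * x for w, x in zip(unit_weights, inputs))
        outputs.append(sigmoid(net))
    return outputs


def forward(x, v, v_bias, w, w_bias):
    """Input layer -> hidden layer Z -> output layer Y."""
    z = layer_forward(x, v, v_bias)
    y = layer_forward(z, w, w_bias)
    return z, y


# Toy values chosen only for illustration (not taken from the paper).
x = [0.0, 1.0]
v = [[0.5, -0.5], [0.25, 0.75]]   # 2 hidden units, 2 inputs
v_bias = [0.0, 0.0]
w = [[1.0, -1.0]]                  # 1 output unit, 2 hidden units
w_bias = [0.0]

z, y = forward(x, v, v_bias, w, w_bias)
print(z, y)
```

The same `layer_forward` call can be chained once more to model a second hidden layer, which is the configuration used later in this research.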

A. Data Collection
The data in this research were collected directly from resource persons (former DHF patients) in various hospitals in Indonesia and through internet searches. A total of 100 data were used, a number chosen to match the sample size calculated with the Slovin formula [17]:
$$n = \frac{N}{1 + N e^{2}} \quad (10)$$
where $n$ is the number of samples, $N$ is the total population, and $e$ is the margin of error, that is, the percentage of error that can be tolerated. In this study, the total population ($N$) is the number of people in Indonesia affected by DHF during the past year (2016), namely 126,675 people [18], while the tolerable error ($e$) is a maximum of 10%. Thus, the number of samples ($n$) used by the researchers is:
$$n = \frac{126{,}675}{1 + 126{,}675\,(0.1)^{2}} = 99.92 \approx 100 \quad (11)$$
Here are the data used in this research:
Grade I (0,0), Grade II (0,1), Grade III (1,0), Grade IV (1,1)
The use of 10 input units is based on the 8 blood components that are always present in the laboratory examination results of dengue fever patients, plus blood pressure and the number of days of fever. The input range of 0.0 to 1.0 is used because the binary threshold activation outputs only 0 and 1, which requires input values in the interval between 0 and 1.
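The Slovin sample-size calculation above can be checked with a few lines of code. This is a small sketch; the function name `slovin_sample_size` is hypothetical, and the inputs are the population and margin of error stated in the text.

```python
import math


def slovin_sample_size(population, margin_of_error):
    """Slovin formula: n = N / (1 + N * e^2)."""
    return population / (1 + population * margin_of_error ** 2)


# N = 126,675 DHF cases in Indonesia (2016), e = 10%.
n = slovin_sample_size(126675, 0.10)
print(round(n, 2))    # 99.92
print(math.ceil(n))   # 100
```

Rounding up gives the 100 samples used in this research.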
The number of hidden units, between 3 and 8, is chosen so that it is less than the number of input units and greater than the number of output units, to prevent overfitting [22]. The 2 output units are based on the target, which consists of 4 categories (Level I, Level II, Level III, Level IV), and on the binary values used: representing 4 target categories requires a combination of 2 binary output values. The learning rate of 0.3 is chosen because previous studies [16] found that a small learning rate recognizes patterns better than a large one, since it changes the weights little by little in each iteration (epoch). The number of epochs, 1000, is the estimated maximum number of iterations that may be required in the learning process. The tolerance of 0.02 is the maximum pattern recognition error (98% accuracy) that is still acceptable in the training process. The threshold of 0.5 is chosen because it lies in the middle of the network input value interval; according to Srivastava, N et al [20], 0.5 is the optimal threshold value for the network architecture. This value is used to determine whether a neuron in the next hidden layer unit is activated or not. If it is not activated, the neuron is removed (dropped out). Here is an example of a network architecture that applies the dropout technique to a hidden layer with 3 initial hidden units (Y):
The initial weights are random values in the interval 0 to 1, matching the input and output intervals of 0 and 1 (binary). Before training on the 60 data, the values of the 10 input units were first normalized with the Min-Max method so that they share the same interval of 0 to 1. The upper and lower bounds of the values of the 10 input units before normalization are as follows:
The upper bound values in the table above are reference data obtained from consultations at several hospitals, together with the range of values that may occur in dengue hemorrhagic fever cases. If a sample has a value above that range, the value is set to the maximum value in the table above.
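The threshold-based dropout rule described above, in which a hidden unit that is not activated at the 0.5 threshold is removed, can be sketched as follows. This is one possible reading of the text, not the authors' code: the function name `apply_threshold_dropout`, the use of `>=` at the boundary, and the sample activations are all assumptions made for illustration.

```python
THRESHOLD = 0.5  # threshold value stated in the text


def apply_threshold_dropout(activations, threshold=THRESHOLD):
    """Drop hidden units whose activation falls below the threshold.

    Returns the activations with dropped units set to 0.0, plus a mask
    marking which units stay active. Treating a value exactly equal to
    the threshold as active is an assumption.
    """
    mask = [a >= threshold for a in activations]
    kept = [a if keep else 0.0 for a, keep in zip(activations, mask)]
    return kept, mask


# Hypothetical activations for a hidden layer with 3 initial units.
hidden = [0.82, 0.31, 0.64]
kept, mask = apply_threshold_dropout(hidden)
print(kept)  # [0.82, 0.0, 0.64]
print(mask)  # [True, False, True]
```

In this sketch the second unit is dropped, so only two of the three hidden units pass their signal to the next layer, which corresponds to a partially connected scheme.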

Result and Discussion
Here are some implementation results of the training, validation, and final testing in this research:
From the training results, the maximum accuracy achieved at this stage is 95%, with 5 hidden units in the first layer (3 units dropped out) and 3 hidden units in the second layer (5 units dropped out). For training with the fully connected network architecture, the maximum accuracy obtained is only 71.7%, reached when there are 5 hidden units in the second hidden layer. Although the accuracy obtained with both types of architecture is not 100%, the resulting weights are still tested in the validation and final testing stages. This is done to see the effect of the training accuracy on the network capability in the final testing phase. In addition, these weights are the best result of the training, in which the epoch reached 1000 iterations.
The table above summarizes the results of the validation stage on 40 data using 12 types of network architecture or scheme, with accuracy ranging from about 30% to a maximum of 87.5%. The best percentage is obtained with the network architecture of scheme B6, the same as in the training phase. In other words, the weights from the network training process greatly affect the accuracy of recognizing new data patterns.
The better the training results, the better the test results, and vice versa (for example in scheme A1). In addition, the dropout mechanism also affects the learning performance: the network activates neurons/units more selectively, so that only the best weights are generated. Here is the system display and the final test results on 20 (non-learning) data:
Figure 5 above shows the final test result with scheme A1, in which only 4 of the 20 data tested were diagnosed at the correct level (see Table 7). For the fully connected network schemes (A1-A6), the maximum accuracy is 60% in scheme A5, with an average accuracy of only 37.5%. For the partially connected network schemes (B1-B6), which apply dropout, the maximum accuracy is 85% in scheme B6, with an average accuracy of 71.6%. Based on these test results, the best accuracy in recognizing the level of dengue fever is obtained with the partially connected network.
Analyzed more deeply, the varying accuracy can be affected by the number of active and inactive neurons in the first and second hidden layers. In scheme B1, all 3 neurons in the hidden layer are activated, producing a recognition accuracy of 70%. In scheme B2, 4 of the 5 neurons in the hidden layer are activated, producing an accuracy of 70%. In scheme B3, 5 of the 8 neurons in the hidden layer are activated, producing an accuracy of 70%. In scheme B4, 2 of the 3 neurons in the first and second hidden layers are activated, producing an accuracy of 75%. In scheme B5, 3 of the 5 neurons in the first hidden layer and all the neurons in the second hidden layer are activated, producing a lower accuracy than scheme B4, namely 60%. In scheme B6, however, 5 of the 8 neurons in the first hidden layer and 3 of the 8 neurons in the second hidden layer are activated, producing a fairly high accuracy of 85%. From all these results, the best network architecture is obtained when the number of neurons is 11 (input) - 5 (hidden) - 3 (hidden) - 2 (output), i.e. when the number of neurons decreases in both hidden layers. This result differs greatly from the case in which the reduction of neurons occurs in only one hidden layer (scheme B5). Therefore, it can be concluded that applying the deep neural network method to the diagnosis of dengue hemorrhagic fever achieves a maximum success rate of 85% (with an average of 71.6%), better than the perceptron neural network method applied in previous research [16], which achieved a highest accuracy of only 80% (with an average of 71.1%).
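The per-scheme figures quoted above can be cross-checked with a short calculation. The accuracy values are the ones reported in the text for schemes B1 to B6 and for the A1 final test; the variable names are hypothetical.

```python
# Final-test accuracy (%) reported in the text for the partially
# connected schemes B1-B6.
partial_scheme_accuracy = {
    "B1": 70, "B2": 70, "B3": 70, "B4": 75, "B5": 60, "B6": 85,
}

mean_b = sum(partial_scheme_accuracy.values()) / len(partial_scheme_accuracy)
best_scheme = max(partial_scheme_accuracy, key=partial_scheme_accuracy.get)

# Scheme A1: 4 of the 20 test data diagnosed at the correct level.
a1_accuracy = 4 / 20 * 100

print(f"mean B accuracy: {mean_b:.2f}%")   # 71.67%
print(f"best B scheme: {best_scheme}")     # B6
print(f"A1 accuracy: {a1_accuracy:.0f}%")  # 20%
```

The computed mean of about 71.67% agrees with the reported 71.6% if that figure is truncated rather than rounded.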

Conclusion
The conclusions that can be given from this research are:
3. The use of the unit dropout technique with a threshold value of 0.5 on the hidden layer can improve the accuracy of pattern recognition, with a fairly high difference in average accuracy of 34.1%. This may be because the network becomes more selective about which target-related information is processed further (passed to the next layer) and which is omitted, so the resulting network recognizes the patterns of each level of dengue disease better.
4. The deep neural network method gives better accuracy than the perceptron neural network method, although it needs more time and more epochs to reach optimal training results. This can be seen from the final testing with the deep neural network method, which still reaches a maximum accuracy of 85% even though the training results did not reach 100% accuracy. In contrast, the previous study [16] achieved optimal training, but its final testing produced a maximum accuracy of only 80%.
5. The deep neural network with 10 input units, 2 hidden layers, 2 output units, a learning rate of 0.3, a tolerance rate of 0.02, a threshold value of 0.5, binary threshold activation, and a partially connected network with a dropout mechanism has a good ability to recognize patterns of dengue hemorrhagic fever, with an average accuracy rate of 71.6%. The system developed is therefore applicable for medical purposes.
c) Compare the output value $Y_k$ with the target $t_k$ and, where they differ, calculate the error obtained:
$$\delta_k = (t_k - Y_k)\, f'(y\_in_k) \quad (6)$$
d) Then calculate the new weights $W$ and $V$, where $\alpha$ is the learning rate:
$$W_{kj}(\text{new}) = W_{kj}(\text{old}) + \alpha\, \delta_k Z_j \quad (7)$$
$$V_{ji}(\text{new}) = V_{ji}(\text{old}) + \alpha\, \delta_j X_i \quad (8)$$

B. Dengue Hemorrhagic Fever
Dengue Hemorrhagic Fever (DHF) is an infectious disease caused by the dengue virus and transmitted by Aedes aegypti and Aedes albopictus [10]. This disease often leads to outbreaks, and its death rate is quite high, especially at the onset of the rainy season, because it spreads easily and no vaccine for treatment and prevention has been found until now. The DHF virus is small and has 4 serotypes, namely DEN1, DEN2, DEN3, and DEN4. In Indonesia, the most dominant serotypes are DEN2 and DEN3, whereas DEN3 to DEN4 are classified as severe dengue cases [9]. DHF severity is classified into 4 levels, where levels III and IV are grouped as Dengue Shock Syndrome (DSS):
Level I: fever with unclear symptoms; bleeding manifestations only in the form of a positive tourniquet test and easy bruising.
Level II: the manifestations of level I plus spontaneous bleeding, usually in the form of bleeding in the skin or other tissue.
Level III: circulatory failure, marked by a narrow and weak pulse pressure with cold and moist skin.
Level IV: initial symptoms of shock in the form of low blood pressure, up to the point where the pulse pressure cannot be detected.
In laboratory tests of blood samples, DHF can be analyzed by looking at Platelet, Hematocrite, Hemoglobin, Lymphocyte, Monocyte, Eosinophil, Leucocyte, Basophil, Erythrocyte, MCV, MCH, MCHC, and ESR. Here is an example of blood sample data from a DHF patient:

Jurnal Ilmu Komputer dan Informasi (Journal of Computer Science and Information), volume 12, issue 2, June 2019

C. Data Normalization
Data normalization is a method for bringing intervals of different values onto the same scale. Essentially, normalization is used to assign equal weight to different data values. Normalization methods include Z Score, Decimal Scaling, Softmax, and Min-Max. In this research, the normalization method used is Min-Max Normalization. Here is the formula [5]:
$$D'(i) = \frac{D(i) - \mathrm{Min}(D)}{\mathrm{Max}(D) - \mathrm{Min}(D)} \cdot (U - L) + L \quad (9)$$
where:
D'(i) = data after normalization (for the i-th data)
D(i) = data before normalization
Min(D) = minimum value of data D
Max(D) = maximum value of data D
U = upper bound of the interval
L = lower bound of the interval
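The Min-Max formula above can be sketched as a small function. This is an illustrative sketch only: the name `min_max_normalize` and the example bounds (0 and 200) are hypothetical, and clipping values below the minimum is an assumption; the text only states that values above the reference range are set to the maximum.

```python
def min_max_normalize(value, data_min, data_max, lower=0.0, upper=1.0):
    """Scale value from [data_min, data_max] to [lower, upper]."""
    if data_max == data_min:
        raise ValueError("data_max must differ from data_min")
    # Out-of-range values follow the nearest bound (the lower-side
    # clipping is an assumption, see the lead-in).
    clipped = min(max(value, data_min), data_max)
    return (clipped - data_min) / (data_max - data_min) * (upper - lower) + lower


# Hypothetical input unit with reference bounds 0 and 200.
print(min_max_normalize(50.0, 0.0, 200.0))    # 0.25
print(min_max_normalize(250.0, 0.0, 200.0))   # 1.0 (clipped to the upper bound)
```

With `lower=0.0` and `upper=1.0` the function reduces to the 0 to 1 scaling used for the 10 input units in this research.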

Figure 4. Training Form
The figure above shows the result of one of the training parameter schemes, in which the recognition percentage reaches only 41.7% after 1000 iterations. The training is then repeated with different numbers of hidden layers and hidden units, until it produces 100% recognition accuracy or the epoch limit is reached. All the weights generated from this training are stored and used in the validation and final testing stages.

TABLE 2. TRAINING SCHEMA OF NETWORK PARAMETERS

TABLE 3. MAXIMUM AND MINIMUM VALUE OF INPUT UNIT

TABLE 4 Test
1. The results of the training are directly proportional to the test results: if the training does not produce maximum accuracy, then the weights used in the testing stage cannot produce maximum accuracy either, and vice versa.
2. Training results with an accuracy below 100% can still produce good test accuracy. This is evident from the validation stage on 40 dengue patient data in randomized samples, which achieved 87.5% accuracy, and from the final test on 20 new non-training data, which achieved 85% accuracy, both using weights trained on 60 data with 95% accuracy.