Student Academic Mark Clustering Analysis and Usability Scoring on Dashboard Development Using K-Means Algorithm and System Usability Scale

Learning activities are one of the processes by which teachers deliver information or messages to students. SMPN 4 Sidoarjo is a State Junior High School (JHS) located in Sidoarjo Regency. At this school, the academic score data collected during the learning process were not yet well organized, which made it difficult for teachers and school principals to monitor student learning performance. The score data come from the Bahasa Indonesia subject taught by one teacher and comprise 222 records from the 2019/2020 school year. Students are clustered using K-Means. The number of clusters is determined using the elbow method and displayed in graphic form. The clustering results can be used as a reference for teachers in determining study groups and the best treatment for each cluster. The clustering results are validated using the Davies-Bouldin Index, Silhouette Width, and Calinski-Harabasz Index. Three clusters were obtained for the data at each class level, while the number of clusters ranges from two to five for the data of each study group. A dashboard is used to visualize the clustering results. Usability testing using the System Usability Scale (SUS) yielded a score of 87.5, which means that the dashboard is acceptable to SMPN 4 Sidoarjo.


Introduction
Education is one of the most important factors in human resource development [1] and country development [2]. Learning activities are one of the processes by which teachers deliver information to students; systematic planning and evaluation are needed both inside and outside the school. Good and appropriate learning activities affect the quality of student success. The teacher is the main influence because the teacher can adjust the course of learning activities. The success and failure rates of students in learning reflect the quality of education in Indonesia. Improving that quality requires information on which areas of education need improvement and how to improve them.
Bahasa Indonesia is one of the compulsory subjects at all grades in Indonesia, and Indonesian is the language of instruction in education. However, amid the Indonesian people's efforts to pursue progress, the role and dignity of the Indonesian language have been increasingly marginalized [3].
Our study uses data sampled from SMPN 4 Sidoarjo. Based on interviews with Bahasa Indonesia teachers at SMPN 4 Sidoarjo, teachers and school leaders have not used the collected data to monitor student academic performance because the number of students is large. In the current situation, the teacher only checks the final grade on the transcript to determine the student's understanding of the lesson.
Currently, SMPN 4 Sidoarjo does not have the ability to assess students' understanding using an advanced method such as clustering. Moreover, the current state of COVID-19 makes it difficult for teachers to find out whether their teaching method is effective. To address this problem, we collected the data and examined how the Indonesian language teacher could implement a clustering system with a dashboard, making it easier for teachers to follow student performance through their periodic academic scores.
issue 2, June 2021
The grouping results can later generate recommendations for superior classes in the analysis of each level as well as recommendations in the form of student placement by putting them into several groups and seat placement in the analysis of each study group. Grouping students according to their respective abilities is one factor that can improve the quality of teaching and learning [4]. Previous research used student performance data as a grouping model based on details of international students. This analysis can be done by implementing a data mining task in cluster form using the K-Means algorithm.
There have been many previous studies regarding student clustering using the K-Means algorithm. Li et al. [4] looked for groups of students for group-based learning in foreign language lessons. Sya'iyah et al. also present a clustering result on private educational institutions using the K-Means algorithm [5]. Neither Li's research nor Sya'iyah's research used a clustering validation method on the clustering result. Another study was conducted by Arofah & Marisa [6], who applied data mining to determine students' interest in learning Mathematics. They did not use a cluster validation test to find out whether the clusters perform well. None of those studies used a dashboard as a data visualization medium to help stakeholders understand the clustering results.
Further research has been conducted by Chen et al. [7], who evaluate students' abilities using the K-Means clustering method. That study did not use a dashboard as a data visualization medium either. The analysis results are used to make it easier for the teaching manager to understand the distribution of students' professional abilities, with the aim of developing an appropriate learning plan.
According to Singh et al. [8], clustering is an unsupervised learning method where conclusions are drawn from a set of unlabeled data. The basic rule of clustering techniques is to group the data by maximizing the intraclass similarity and minimizing the interclass similarity [9]. The clustering method's effectiveness depends on the nature of the data used [10].
Therefore, this research determines student clusters at SMPN 4 Sidoarjo using the K-Means algorithm. The research questions are: (1) whether the results of the K-Means algorithm and the validation tests represent the right clusters and (2) whether the usability level of the teacher's dashboard is acceptable.

Methods
This research was carried out in several stages, described in the flow of the research methodology in Figure 1. We collected data on grade 8 and 9 students for the 2019/2020 academic year, including daily test scores, verbal scores, skill scores, mid-semester exam scores, and final exam scores. The data were then transferred to Google Sheets, which can be accessed online.
This study uses numerical data from daily test scores, midterm exams, final semester exams, knowledge scores, skills scores, verbal scores, and practice scores. A preprocessing stage is necessary to ensure that the collected data meet the requirements of the clustering model. This stage includes selecting the data to be used for model implementation. After the data are cleaned, they are transformed by changing the data types to those required by the algorithm.
The elbow method is used to determine the number of clusters at the implementation stage; it was found that the initial cluster number was three. The number is chosen at the point beyond which there is no significant drop in the chart. In Figure 2, where the x-axis is the number of clusters and the y-axis is the distortion value, the elbow method diagram indicates an initial cluster number of three. The next phase is clustering using the K-Means algorithm. K-Means is an algorithm that iteratively groups objects with the same characteristics into a number of clusters determined by the value of k [11]. K-Means is a fundamental algorithm that is easy to implement [12] and popular [13] [14], and it has been researched extensively.
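The elbow procedure described above can be sketched as follows. This is a minimal illustration and not the study's code: the function names `kmeans` and `distortion`, the fixed random seed, and the toy two-column scores are all assumptions made for the example. It runs K-Means for k = 1 to 5 and records the distortion (sum of squared distances to the assigned centroid) that would be plotted on the y-axis.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd-style K-Means. Returns (centroids, labels)."""
    rng = random.Random(seed)           # fixed seed (assumption) for repeatable runs
    centroids = rng.sample(points, k)   # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # leave an empty cluster's centroid in place
        if updated == centroids:
            break                             # converged: centroids stopped moving
        centroids = updated
    return centroids, labels


def distortion(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Hypothetical (daily test, final exam) scores, not taken from the study's data.
scores = [(60, 62), (61, 65), (63, 60),
          (75, 78), (77, 76), (76, 79),
          (90, 92), (92, 91), (91, 94)]

curve = {}
for k in range(1, 6):
    centroids, labels = kmeans(scores, k)
    curve[k] = distortion(scores, centroids, labels)

for k, value in curve.items():
    print(k, round(value, 2))
```

Plotting `curve` with k on the x-axis gives the elbow chart; the number of clusters is read off where the drop in distortion levels out.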
In the K-Means algorithm, there are three main categories of investigation in the literature: selection of the initial centroids (k centroids), acceleration achieved by approximation, and acceleration of exact algorithms [15]. K-Means uses distance and centroid as group limit and greater part vote as new raster assurance [16]. The number of clusters used for initialization can be found using the elbow method. Equation (1) is the Euclidean Distance formula to calculate the distance of each data point to the initial centroid point, where $x_i$ and $y_i$ are the values of $x$ and $y$ on row $i$, summed over all $n$ rows.
$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \quad (1)$$
The K-Means model is implemented with the optimal cluster value, which produces the distribution of the dataset across the clusters. The clustering model was implemented in Python and applied to the grade 8 and 9 data as a whole and to the data for each class group.
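The distance calculation in Equation (1) and the resulting assignment of each student to the nearest centroid can be sketched as below. The helper names `euclidean` and `assign`, the sample scores, and the starting centroids are hypothetical and serve only to illustrate the step.

```python
import math


def euclidean(x, y):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def assign(points, centroids):
    """For each point, return the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: euclidean(p, centroids[c]))
            for p in points]


# Hypothetical (daily test, final exam) scores and initial centroids.
students = [(60, 62), (78, 75), (91, 94)]
centroids = [(62, 63), (76, 77), (92, 92)]

labels = assign(students, centroids)
print(labels)
```

K-Means repeats this assignment after every centroid update until the assignments stop changing.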
The validation test uses the Davies-Bouldin Index, Silhouette Width, and Calinski-Harabasz Index. Davies-Bouldin measures cohesion as the proximity of the data to the centre point of their cluster, and separation as the distance between the centre points of the clusters [17]. The purpose of measuring using Davies-Bouldin is to maximize the inter-cluster distance while minimizing the distance between points within each cluster [8]. The minimum Sum of Squared Error (SSE) value can show better clustering results [18]. Equation (2) is a formula to find the value of Davies-Bouldin, where $\delta(X_i, X_j)$ is the distance between clusters $X_i$ and $X_j$; $\Delta(X_i)$ and $\Delta(X_j)$ are the distances within clusters $X_i$ and $X_j$; and $k$ is the number of clusters.
$$DB = \frac{1}{k} \sum_{i=1}^{k} \max_{j \neq i} \frac{\Delta(X_i) + \Delta(X_j)}{\delta(X_i, X_j)} \quad (2)$$
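As an illustration of the Davies-Bouldin calculation, the sketch below assumes that the within-cluster distance is the mean distance of members to their centroid and that the between-cluster distance is the distance between centroids. The function name `davies_bouldin` and the toy points are hypothetical; the function needs at least two clusters with distinct centroids.

```python
import math


def centroid(members):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(col) / len(members) for col in zip(*members))


def davies_bouldin(points, labels):
    """Davies-Bouldin index; lower values mean tighter, better separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    centers = {lab: centroid(m) for lab, m in clusters.items()}
    # Within-cluster scatter: mean distance from members to their own centroid.
    scatter = {lab: sum(math.dist(p, centers[lab]) for p in m) / len(m)
               for lab, m in clusters.items()}
    total = 0.0
    for i in clusters:
        # Worst-case (largest) similarity ratio of cluster i against every other cluster.
        total += max((scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
                     for j in clusters if j != i)
    return total / len(clusters)


# Toy data: two tight pairs of points that are far apart.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = davies_bouldin(points, [0, 0, 1, 1])   # pairs kept together
bad = davies_bouldin(points, [0, 1, 0, 1])    # pairs split across clusters
print(good, bad)
```

The sensible partition scores lower than the split one, which is the behaviour the validation test relies on.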
Silhouette Width is the degree of confidence in placing objects in each cluster during the clustering process [11]. Silhouette Width serves to evaluate the validity of a cluster. If the Silhouette value is close to 1, the data occupy the right cluster [13]. If the Silhouette Width value is positive, this method can be used as a valid measure [19]. Equation (3) is a formula to find Silhouette Width, where $a(i)$ is the average distance between $i$ and the other data in its cluster, and $b(i)$ is the average distance between $i$ and the data in the nearest cluster.
$$s(i) = \frac{b(i) - a(i)}{\max\{a(i), b(i)\}} \quad (3)$$
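A minimal sketch of the Silhouette Width calculation follows, averaged over all points. The function name `silhouette_width` and the toy data are assumptions for illustration; the sketch scores a single-member cluster as 0 (a common convention) and needs at least two clusters.

```python
import math


def silhouette_width(points, labels):
    """Mean silhouette value over all points (requires at least two clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    values = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            values.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's zero distance to itself is excluded via len - 1).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in m) / len(m)
                for other, m in clusters.items() if other != lab)
        values.append((b - a) / max(a, b))
    return sum(values) / len(values)


# Toy data: two tight pairs of points that are far apart.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = silhouette_width(points, [0, 0, 1, 1])   # pairs kept together
bad = silhouette_width(points, [0, 1, 0, 1])    # pairs split across clusters
print(round(good, 3), round(bad, 3))
```

A value near 1 for the sensible partition and a negative value for the split one match the interpretation given in the text.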
The Calinski-Harabasz Index, commonly known as the Variance Ratio Criterion, is the ratio of the dispersion between clusters to the dispersion within each cluster [20]. The higher the Calinski-Harabasz index value, the better the performance of the clustering results [20]. Equation (4) is a formula to find the value of Calinski-Harabasz, where $k$ is the number of clusters; $N$ is the amount of data; $SS_W$ is the overall variance value within each cluster; and $SS_B$ is the overall variance value between clusters.
$$CH = \frac{SS_B / (k - 1)}{SS_W / (N - k)} \quad (4)$$
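The Calinski-Harabasz ratio can be sketched as follows, taking the between-cluster dispersion as the size-weighted squared distance of each centroid to the overall mean and the within-cluster dispersion as the squared distance of each point to its own centroid. The function name `calinski_harabasz` and the toy points are hypothetical, and the function assumes at least two clusters and fewer clusters than points.

```python
def mean_point(members):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(col) / len(members) for col in zip(*members))


def sq_dist(x, y):
    """Squared Euclidean distance between two points."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def calinski_harabasz(points, labels):
    """Variance ratio criterion; higher values mean better-defined clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    n, k = len(points), len(clusters)
    overall = mean_point(points)
    centers = {lab: mean_point(m) for lab, m in clusters.items()}
    # Between-cluster dispersion, weighted by cluster size.
    between = sum(len(m) * sq_dist(centers[lab], overall) for lab, m in clusters.items())
    # Within-cluster dispersion around each cluster's own centroid.
    within = sum(sq_dist(p, centers[lab]) for lab, m in clusters.items() for p in m)
    return (between / (k - 1)) / (within / (n - k))


# Toy data: two tight pairs of points that are far apart.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = calinski_harabasz(points, [0, 0, 1, 1])   # pairs kept together
bad = calinski_harabasz(points, [0, 1, 0, 1])    # pairs split across clusters
print(good, bad)
```

Because the within-cluster term is in the denominator, a single far-off point inside a cluster inflates it and pulls the index down.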
The analysis was carried out in two ways. The first analysis covers each level, where the level indicates the student's grade. The second analysis covers each study group, which means that the analysis is carried out for every class within a level. This study uses study groups from class A to F. The Davies-Bouldin Index is used to show that each cluster is distinct from the others, the Silhouette Width is used to determine whether the data occupy the right cluster or the wrong cluster, and the Calinski-Harabasz Index is used to determine the correct number of clusters.

Analysis of Clustering Result
In this analysis, we applied K-Means and validated the results using the Davies-Bouldin Index, Silhouette Width, and Calinski-Harabasz Index to check that they represent the right clusters. A value of three is obtained for the class 8 and 9 data according to Table 1 and Table 2. The cluster validation test results indicate that the entire dataset for each level has an optimum cluster value of three. Based on Table 3 and Table 4, the cluster performance test results for the class 9 dataset indicate that three clusters is the optimum value for that dataset. The cluster with the highest average score can be made into one superior class so that its students can focus on preparing for entry into Senior High School. The clusters that have a medium average value can then be spread evenly across the remaining classes. Clusters that contain inclusion students can be combined with members of the intermediate clusters. For the study groups, which have a smaller amount of data, the elbow and validation results differed from those of the analysis at each grade level.
Based on the Elbow method results for the class 8 and class 9 datasets, graphs with different levels of decline were obtained for each dataset. From the Elbow graphs, a value range of 2-5 is obtained and used as the limit in determining the number of clusters in the analysis. This research also verifies the optimal cluster value range with stakeholders.

Cluster   Total   (header lost)   (header lost)   Average
1         10      84.28           86.14           85.11
2         3       89.42           91.85           90.9
3         3       75              80              78.19
4         7       85.42           88.71           86.71
5         13      82.14           83.27           83.27

Based on the results of Table 5 to 23, the presence of inclusion students causes a separate cluster to form in the clustering results of each study group. Based on this analysis, teachers can be advised to form groups evenly, in particular by spreading students from the clusters with the highest average value among students from clusters with medium to lower average scores. Students in clusters that include inclusion students and have the lowest average value can be seated in the front row.
The cluster validation results using the Calinski-Harabasz Index for each study group did not work well because each group's data had one outlier, namely the inclusion student's data. The existence of these outliers can affect the Calinski-Harabasz Index in determining clustering results [20].

Dashboard Visualization and Usability Testing
In this analysis, we measured the usability of the dashboard using the System Usability Scale to find out whether it is acceptable to the teacher. The visualization is created using Google Data Studio and consists of three pages. The dashboard's start page in Figure 3 shows a summary of the overall data, including the total number of students, the number of male students, the number of female students, and the number of inclusion students. A diagram on this page shows the categories of average student scores, namely Very Good, Enough, and Less. The categories are based on the Minimum Completeness Criteria that apply to the class. Very Good covers the range 85-100, Enough covers the range 70-84, and Less covers values below 70. The first page also displays the average, lowest, and highest values of the Final Semester Assessment (PAS) and the Mid Semester Assessment (PTS).
The second page, shown in Figure 4, visualizes the clustering results for each study group and provides a filter for the data to be displayed. The page displays clustering data for all classes. This can make it easier to take teaching decisions in class, such as forming groups and choosing seats for the teaching and learning process. The clusters displayed on the dashboard are limited to three per class with stakeholder consent. This limit follows the division of student grade categories on the first page of the dashboard ("Less", "Enough", and "Very Good"), which was obtained from the validation with stakeholders.
The third page visualizes the clustering results for each level. The graphs and information displayed are similar to the clustering views of each study group, as shown in Figure 5.
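The category ranges on the first dashboard page can be expressed as a small lookup. This is a sketch only: the function name `score_category` is hypothetical, and treating averages between 84 and 85 as "Enough" is an assumption, since the stated ranges do not say how fractional averages are handled.

```python
def score_category(average):
    """Map an average score (0-100) to the dashboard's category label."""
    if not 0 <= average <= 100:
        raise ValueError("average must be between 0 and 100")
    if average >= 85:
        return "Very Good"
    if average >= 70:      # assumption: values between 84 and 85 fall here
        return "Enough"
    return "Less"


# Hypothetical averages used only to exercise the three branches.
for value in (90.9, 78.19, 65):
    print(value, score_category(value))
```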
In this study, usability testing was carried out using the Bahasa Indonesia version of the System Usability Scale (SUS) statements, administered to the Bahasa Indonesia teacher as the stakeholder. Table 24 is the result of the SUS questionnaire calculation. Based on the questionnaire results from the respondents, who teach Bahasa Indonesia at SMPN 4 Sidoarjo, this test resulted in a SUS score of 87.5. The dashboard is therefore categorized as "acceptable", which indicates that the user accepts the dashboard well.
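For reference, the standard SUS scoring rule can be sketched as below: each of the ten items is answered on a 1-5 scale, odd-numbered items contribute the answer minus 1, even-numbered items contribute 5 minus the answer, and the sum is multiplied by 2.5. The function name `sus_score` and the sample answers are hypothetical and are not the responses collected in this study.

```python
def sus_score(responses):
    """Compute a SUS score (0-100) from ten answers on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten responses")
    total = 0
    for item, answer in enumerate(responses, start=1):
        if not 1 <= answer <= 5:
            raise ValueError("each answer must be between 1 and 5")
        # Odd items are positively worded, even items negatively worded.
        total += (answer - 1) if item % 2 == 1 else (5 - answer)
    return total * 2.5


# Hypothetical answers: agree (4) on odd items, disagree (2) on even items.
example = [4, 2, 4, 2, 4, 2, 4, 2, 4, 2]
print(sus_score(example))
```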

Conclusion
Clustering students based on academic scores using the K-Means method can be used as a way for teachers to determine study groups and policy directions for each cluster. The K-Means results were validated using the Davies-Bouldin Index, Silhouette Width, and Calinski-Harabasz Index. Based on the cluster validation tests, the optimal number of clusters differs between datasets. For each grade-level dataset, the Davies-Bouldin Index, Silhouette Width, and Calinski-Harabasz Index all give an optimal value of three clusters. However, different results were obtained for the dataset of each study group (class). Each group had its best number of clusters in the range 2-5 given by the elbow method, and the number of clusters varied from class to class. Usability testing using the System Usability Scale (SUS) produced a value of 87.5, indicating that the dashboard is in the acceptable category. The results were visualized using Google Data Studio on three pages to make it easier for teachers to read the data and monitor student performance.