SENTIMENT ANALYSIS ON E-SPORTS FOR EDUCATION CURRICULUM USING NAIVE BAYES AND SUPPORT VECTOR MACHINE

E-sports education is not just about playing games; it also covers game creation, development, marketing, research, and other forms of education aimed at training skills and providing knowledge that fosters character. Public opinion on it takes the form of support, criticism, and suggestions. The very large volume of comments needs to be analyzed accurately in order to separate positive from negative sentiment. This research was conducted to measure opinions, that is, to separate positive and negative sentiment towards e-sports education, so that valuable information can be extracted from social media. The data used in this study were obtained by crawling the social media platform Twitter. The study uses two classification algorithms, Naïve Bayes and Support Vector Machine. Comparing the two, Naïve Bayes with SMOTE achieves an accuracy of 70.32% and an AUC of 0.954, while Support Vector Machine with SMOTE achieves an accuracy of 66.92% and an AUC of 0.832. From these results it can be concluded that Naïve Bayes has a higher accuracy than Support Vector Machine, with a difference of 3.4 percentage points. Naïve Bayes can thus better predict the achievement of e-sports for students' learning curriculum.


Introduction
The entertainment industry has always attracted wide attention and is growing rapidly, drawing a very large audience. Technological development today is no longer focused only on broadcast content such as film, music, and stage drama. The video game industry also deserves attention, given its growing demand and the many game developers who follow trends across a wide range of user groups. Video games were once considered only a hobby and were often ignored; now people of all ages and genders around the world play them, and the industry has developed into a promising source of income [1]. Gaming is often expected to be seen in a purely positive light, but in reality playing games has both positive and negative effects. One negative effect that often occurs among gamers who lack parental supervision is addiction: the player's curiosity leads to dependence and a compulsion to keep playing [2].
Jurnal Ilmu Komputer dan Informasi (Journal of Computer Science and Information), volume 13, issue 2, June 2020
A study by Syahran on game addiction cites Mark Griffiths, a video game expert at Nottingham Trent University in the United Kingdom, who found that playing games has become a daily activity for people from almost all walks of life. David Greenfield, a psychology expert in America, found that about 6% of internet users experience online game addiction; more worrying still, about 7% play for at least 30 hours per week. This is because children aged 12 to 18 often play online games while frequently browsing an internet that is not filtered for harmful content [3].
Minister of Education and Culture Regulation No. 62 of 2014 concerning Extracurricular Activities, issued to support the achievement of educational goals in Basic and Secondary Education, states that extracurricular activities are carried out by students outside the learning hours of intracurricular and curricular activities, under the guidance and supervision of education units, and aim to optimally develop the potential, talents, interests, abilities, personality, collaboration, and independence of learners [4].
Schools offer many forms of extracurricular activity based on students' interests and talents, such as Flag Raisers (PASKIBRA), Youth Red Cross (PMR), Scouts, Mosque Youth Association (IREMA), art activities such as modern and traditional dance, choir, and marching band, and sports activities such as badminton, soccer, futsal, and volleyball, as well as electronic sports (e-sports). In general, e-sports is a branch of virtual sport in which the players are not physically involved: aspects of the sport are facilitated by electronic systems and carried out online, so that teams can compete without meeting face to face [5].
E-sports games are developing rapidly alongside science and technology and are becoming an influential factor in the world of education. Students of the current generation respond quickly to technological developments, because many online games require players to be skilled at devising strategies, managing teamwork, negotiating, and making the right decisions. Examples are the online games featured in e-sports events, such as DOTA, Arena of Valor, Free Fire, Point Blank, Mobile Legend, PUBG, PES Soccer, and many more [6].
When education and online gaming are set side by side, there are indeed many pros and cons. The opposing view in society is the stigma that children who play games too often lose track of time and neglect their learning priorities. Behind this lies a character crisis among children, driven by technology that is increasingly easy to access and increasingly digital and automated. Children's play is shifting more and more toward modern (digital) games, which strongly affects their behavior and habits. The resulting phenomenon is alarming: it affects children's learning achievement, contributes to a character crisis and aggressive behavior, and can even draw children into criminal acts that may lead to death [7].
Another view holds that e-sports is not only entertainment but also carries educational value for students. E-sports becomes a useful and value-rich activity when it is carried out in a directed and continuous manner, with guidance and supervision from the parties concerned, such as parents and the school [8].
Game addiction strongly affects young students and lowers their concentration during learning. E-sports education must therefore have a clear purpose. If it is only entertainment, losing is not a problem. But it must be accompanied by the aim of becoming a player who takes e-sports seriously, in the sense of being a professional like a true athlete: practicing with defined rules, systems, strategic patterns, and the various supporting aspects that keep e-sports itself sustainable [9].
E-sports education is not just about playing games. It also provides a basic understanding of the ins and outs of games, from creation and development to marketing the e-sports game business, together with research and other forms of education that support students' interests and talents. Its aim is to train skills and provide adequate knowledge so that participating students gain the foundation needed to foster character [6].
Various methods have been used to find out how online game addiction affects learning achievement. With the rapid development of technology and knowledge-based computer systems, such studies have come to involve researchers from every field of computer science. This research was conducted to help address the problem by using data mining classification to predict the achievement of e-sports for the student learning curriculum. This requires a method that can process the data collected in this study.
Previous studies have examined the influence of online games on the prediction of learning achievement using the Naïve Bayes, Random Forest, and C4.5 algorithms [2]. A study of aggressive behavior among adolescents in Samarinda explains that changes in behavior are caused by the online games that teenage students play [10]. The relationship between playing games and the motivation of middle school students in the Bacolod West district was found to be highly significant, because online games can disrupt concentration on learning [11]. Research on online game addiction in Indonesia shows that middle school students spend a lot of time playing online games [12]. Online game addiction leads to problems with attraction, aggressiveness, and interpersonal relationships that in turn lead to psychological problems [13]. Other literature explains that online games are not considered to have a negative impact [14]. Online games caused students of Junior High School 1 Kuta to experience a decline in achievement [15]. A further question is whether active socializing, parental guidance, and discipline are related to learning achievement [16].
Combining education with computer science, specifically data mining, makes this research increasingly interesting. Related studies include: Handling Unbalanced Data in Predicting Churn Customers Using Combined Sampling and Weighted Random Forest [17]; predicting how long students can actively learn with the Random Forest method; Hypertension Prediction System Using Naive Bayes Classifier [18]; Nonlinear Methodology for Identifying Seismic Events and Nuclear Explosions Using Random Forests, Support Vector Machines, and Naive Bayes Classifications [19]; Prediction of Timeliness of Graduation Students Using Naïve Bayes: Case Study at Syarif Hidayatullah Islamic State University Jakarta [20]; Student Academic Performance Evaluation Using the Naïve Bayes Algorithm (Case Study: Fasilkom Unilak) [21]; Naive Bayes Method for Graduation Prediction (Case Study: New Student Data) [22]; data mining to predict the type of transaction in cooperative loans with the C4.5 algorithm [23]; a Decision Tree-based decision support system for awarding scholarships, with a case study at AMIK BSI Yogyakarta [24]; sentiment analysis of public opinion on forest fire news through a comparison of the Support Vector Machine and k-nearest neighbor algorithms based on particle swarm optimization [25]; application of the C4.5 algorithm based on particle swarm optimization to predict tithe donation results for ease of service programs [26]; and implementation of the Naïve Bayes, Random Forest, and C4.5 algorithms on online games for the prediction of learning achievement, with MAN 4 Karawang students as research objects [2].
The volume of data on websites and social media is so large that detecting sentiment is very difficult [32]. Twitter users coin their own words and use non-standard spelling and punctuation: they make misspellings, use slang and newly invented words, add URLs, and use special terms and abbreviations typical of their age group. Such texts need to be cleaned before analysis: HTML characters, slang words, emoticons, meaningless words, punctuation, and URLs need to be removed [33].
Twitter data is particularly interesting material for analysis for several reasons: 1) most e-sports players have an account and are active on social media; 2) many sources of e-sports tournament information publicly share their ideas and comment on policies issued by the relevant parties; and 3) the community freely submits ideas, support, responses, and criticism to the government regarding e-sports policies in education.
Opinion Mining (OM) and Sentiment Analysis (SA) are two emerging fields that aim to help users find opinion information and detect sentiment polarity. The two terms are generally used interchangeably. However, some researchers state that they address two slightly different problems [34]. Sentiment analysis builds systems that try to identify and extract opinions in texts [5]. It focuses on text classification and is related to information extraction from text. Sentiment polarity is usually classified as positive, negative, or neutral [36].
Sentiment can be found in various media, such as Facebook, Twitter, or comments on a product, and provides useful indicators for various purposes. A sentiment can be categorized into two groups, namely negative and positive. Sentiment analysis is a natural language processing technique for measuring the opinions or sentiments expressed in a selection of tweets [37]. Sentiment analysis on Twitter has recently become a popular way for organizations and individuals to monitor public opinion of their brands and businesses. One of the main challenges for Twitter sentiment analysis methods is the noisy nature of the data generated on Twitter. Twitter only allows 140 characters in each post, which encourages abbreviations, irregular expressions, and rare words. This increases data sparsity, which affects the performance of Twitter sentiment classifiers [34]. A well-known method for reducing noise in textual data is the removal of stop words. It is based on the idea that discarding non-discriminative words reduces the classifier's feature space and helps produce more accurate results [38]. A pre-processing step is therefore needed to bring the text into a standard form that is easy to process further. The result of pre-processing is a set of base words that excludes stop words. Stop words in sentiment analysis are words that carry weak semantics and sentiment in context [39].
Twitter sentiment analysis can be a powerful tool for analyzing public perception. It has attracted a lot of attention because of Twitter's rapid growth in popularity as a platform where people express their opinions and attitudes on various topics. Approaches to Twitter sentiment analysis tend to focus on identifying the sentiment of individual tweets [40].
Sentiment analysis is the process of automatically extracting and processing data with certain algorithms to obtain the sentiment information contained in an opinion sentence [41]. It refers to a general method for extracting polarity and subjectivity from semantic orientation, that is, the polarity and strength of words, phrases, or texts [42]. Analysis of social media data can help businesses, governments, security organizations, and environmental organizations to learn about people's problems, suggestions, and criticisms and to find the right solutions. Analyzing social media data requires adapting to new methods and tools, as well as a better understanding of people's opinions, criticisms, and insights. Sentiment analysis is a field of research that emerged from Natural Language Processing (NLP) to extract people's opinions, thoughts, and views [43].
Among supervised learning methods, SVM is a popular learner that is trained on labeled data and classifies feature vectors into one of two groups [29]. A Support Vector Machine determines the linear boundary (hyperplane) that separates the data into categories by maximizing the distance (margin) between the support vectors of the categories [30]. NB is a learning algorithm based on Bayes' theorem with a strong (naive) independence assumption. Bayes' theorem is used to find the most probable outcome given the existing data [31].
This research applies data mining to compare the Naïve Bayes and Support Vector Machine methods and find the algorithm with the highest accuracy in predicting the achievement of e-sports for students' educational curriculum in the context of online game addiction, a comparison that has not been made in previous research.

Methods
This study follows CRISP-DM (Cross-Industry Standard Process for Data Mining), which describes the data mining process in six stages: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment, as shown in Figure 1.

Business Understanding
This stage concerns understanding the research topic. In this research, the subject is understood by collecting information from the social media platform Twitter, using keywords to retrieve tweets about the object of study. The motivation is that tweets are usually text in digital media that can be grouped by the content of the discussion into comment categories. Online media is not only a place to read an article's main page; it can also be used to observe emerging issues and even to see the available education options. This sentiment analysis is done to find a classification method that can help identify positive and negative comments. At this stage, it is understood that finding the best classification method helps in processing the information, by comparing the results of the algorithms used and improving the performance of the classification method using selected features.

Data Understanding
In the next step, raw data are extracted according to the required attributes. Using keywords related to e-sports, data were obtained from the social networking site Twitter. Data were collected from 2 May 2020 to 11 May 2020. The initial data obtained were 15,000 tweet comments. After collection, the data were cleaned, because they were still unstructured and some did not fit the research topic; only data relevant to e-sports for the education curriculum were kept. Cleaning was carried out in Ms. Excel 2010 by deleting duplicate data with case-sensitive matching (match case), which distinguishes between uppercase and lowercase letters, and then in RapidMiner with the process document from data tool. The data remaining afterwards were 8,453 comments. This study uses comments written in Indonesian.

Data Preparation
The data preparation phase combines the steps aimed at obtaining data that are clean and ready for use in the research. In the initial stage of text mining, text pre-processing is applied using the RapidMiner tool. Several text pre-processing steps are applied to the comment dataset, including transform case, tokenizing, filter tokens by length, and filter stop words. These stages are explained in more detail in the next section.

Modelling
This is the stage of selecting data mining techniques by determining the algorithms to be used. Modeling is carried out with RapidMiner version 8.2 in accordance with the chosen techniques. This study uses two classification algorithms as its models [28]: Naïve Bayes (NB) and Support Vector Machine (SVM). Each model is tested on classifying tweets as positive or negative, with the aim of achieving the best accuracy for each algorithm.
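To make the Naïve Bayes model more concrete, the following is a minimal sketch of a multinomial Naïve Bayes text classifier in plain Python. It is not the RapidMiner operator used in this study: the function names, the add-one (Laplace) smoothing, and the toy Indonesian tokens and labels are assumptions made only for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return the class with the highest log posterior for the tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # log prior P(class)
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # log likelihood P(word | class) with add-one smoothing (assumption)
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, not taken from the study's dataset.
docs = [
    ["esports", "bagus", "prestasi"],
    ["game", "kecanduan", "buruk"],
    ["turnamen", "bagus"],
    ["kecanduan", "malas"],
]
labels = ["positive", "negative", "positive", "negative"]
model = train_nb(docs, labels)
print(predict_nb(model, ["esports", "bagus"]))
```

Working in log space avoids numerical underflow when many word probabilities are multiplied together.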

Evaluation
The evaluation phase aims to determine the usefulness of the model created in the previous modeling step. In this study, the evaluation phase is used in conjunction with the Synthetic Minority Over-Sampling Technique (SMOTE). RapidMiner version 8.2 is used to compare the Naïve Bayes and Support Vector Machine algorithms, each with and without SMOTE, in order to find the better of the two classification methods on the dataset. The purpose of using SMOTE in this study is to increase the accuracy obtained by each algorithm, and thus to find the best algorithm for this research [28].

Deployment
The deployment stage is used to create an implementation of the model in a tool, which can be built with various programming languages. This implementation uses the results of the experiment and evaluation stages as its reference data. In this research, deployment is the implementation phase for the results of comparing the 2 algorithms: the algorithm with the highest accuracy can be used as a basis for development in predicting the achievement of e-sports for students' learning curriculum in the context of online game addiction.

Weighting Word
Feature weighting, also called word weighting, is a process that scores each attribute based on its relevance and its impact on the classification results [31]. This score can then be used as a basis for feature selection, filtering out the features with the lowest calculated weights. Weighting here uses the TF-IDF (Term Frequency - Inverse Document Frequency) method, one of the feature weighting algorithms in text mining. TF is the number of times a term occurs in a document. The more often a term occurs in the document (high TF), the higher its weight or match value. The TF variant commonly used in calculations is raw TF, in which the TF value is simply the number of times the term appears in the document. For example, if a term appears five (5) times, its TF value is five (5). IDF (Inverse Document Frequency) measures how widely a term is distributed across the document collection. The fewer documents contain the term, the higher its IDF value. IDF is calculated using equation (1) [31].

$$\mathrm{IDF}_j = \log\left(\frac{D}{df_j}\right) \qquad (1)$$

In equation (1), D is the number of all documents in the collection, while df_j is the number of documents containing term j [31].

$$w_{ij} = tf_{ij} \times \log\left(\frac{D}{df_j}\right) \qquad (2)$$

In equation (2), the general formula for TF-IDF term weighting combines the standard TF formula with the IDF formula by multiplying the TF value by the IDF value, where tf_ij is the number of occurrences of term j in document i and w_ij is the resulting weight.
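The TF-IDF weighting described in this section (raw TF multiplied by IDF) can be sketched in a few lines of Python. This is only an illustration: the function name `tf_idf`, the base-10 logarithm, and the toy documents are assumptions, and RapidMiner's own TF-IDF output may be normalized differently.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # df[term] = number of documents that contain the term at least once
    df = Counter()
    for tokens in documents:
        df.update(set(tokens))
    weights = []
    for tokens in documents:
        tf = Counter(tokens)  # raw term frequency
        weights.append(
            # base-10 logarithm is an assumption; the text does not fix the base
            {term: count * math.log10(n_docs / df[term])
             for term, count in tf.items()}
        )
    return weights


# Hypothetical toy corpus of three tokenized comments.
docs = [
    ["esports", "sekolah", "esports"],
    ["sekolah", "kurikulum"],
    ["game", "sekolah"],
]
weights = tf_idf(docs)
print(weights[0])
```

In this toy corpus the term "sekolah" occurs in every document, so its IDF and therefore its weight is zero, while "esports" occurs twice in a single document and receives the highest weight there.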

Synthetic Minority Over-Sampling Technique
In previous studies, the Synthetic Minority Over-Sampling Technique (SMOTE) balanced the dataset by generating synthetic minority data in the input space based on neighborhood information. The training dataset consists of minority data points (Smin) and majority data points (Smaj). For each (Xi, Yi) ∈ Smin, its nearest minority neighbors are identified [28]. For each (Xi, Yi) ∈ Smin, SMOTE then generates a new minority data point along the line segment joining Xi and one of its nearest neighbors, chosen at random. SMOTE can be seen in equation (3) [28].

$$X_{new} = X_i + (\hat{X}_i - X_i) \times \delta \qquad (3)$$

In equation (3), \hat{X}_i is the randomly chosen nearest neighbor of X_i and δ is a random number between 0 and 1.
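The interpolation step of SMOTE can be illustrated with the short Python sketch below. It generates a synthetic point on the line segment between a minority point and one of its nearest minority neighbors. The function names, the choice of k = 2, the Euclidean distance, and the toy points are assumptions for illustration; this is not the RapidMiner SMOTE operator.

```python
import random


def smote_sample(minority, k, rng):
    """Create one synthetic point from a random minority point and a neighbor."""
    i = rng.randrange(len(minority))
    x = minority[i]
    # Sort the other minority points by squared Euclidean distance to x.
    others = [p for j, p in enumerate(minority) if j != i]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
    neighbour = rng.choice(others[:k])  # one of the k nearest, at random
    delta = rng.random()                # random number in [0, 1)
    # x_new = x + (neighbour - x) * delta
    return [a + (b - a) * delta for a, b in zip(x, neighbour)]


def oversample(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority points."""
    rng = random.Random(seed)
    return [smote_sample(minority, k, rng) for _ in range(n_new)]


# Hypothetical 2-D minority points.
minority = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [2.0, 2.0]]
synthetic = oversample(minority, n_new=5)
print(synthetic)
```

Because every synthetic point lies between two existing minority points, the new data stay inside the region already occupied by the minority class.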

Results and Analysis
This research uses data taken from tweet comments on Twitter related to e-sports, as described in the Data Understanding section above. In total 15,000 comments were collected, as presented in Figure 2. The data then go through initial cleaning. First, Ms. Excel 2010 is used to remove duplicate data with case-sensitive matching (match case), which distinguishes uppercase from lowercase letters. Cleaning then continues in RapidMiner using the process document from data tool, which converts all letters to lowercase (transform cases with lower case), deletes words shorter than 3 letters (filter token by length), and removes all unneeded words, symbols, and special characters from each document (tokenizing with regular expressions): mentioned usernames (@username), hashtags (#hashtag), punctuation, numbers (because only text data is used), links (http://), special characters such as symbols or emoticons, and foreign words (because only Indonesian words are needed). Each comment is then labeled manually as positive or negative with the help of Indonesian language experts, using multiple labelers to label the large dataset (crowdsourced labeling). After this initial stage, 8,453 labeled comments remain and form the dataset for this study. RapidMiner version 8.2 is used to process the data into a model that fits this research case, namely sentiment analysis for predicting the achievement of e-sports for the student learning curriculum using the Naïve Bayes and Support Vector Machine classification algorithms. Because this research is a text mining task, several stages must be completed before a good model can be found for the case study of sentiment analysis on e-sports for the education curriculum. So that the research can be traced and re-tested, the steps carried out in this study are outlined in a research framework model. The stages in this framework serve as a reference throughout the research process. The framework model is presented in Figure 3.

Pre-processing
This stage covers the initial processing of the dataset before it can be classified using the Naïve Bayes (NB) and Support Vector Machine (SVM) algorithms with the Synthetic Minority Over-Sampling Technique (SMOTE). This study applies several pre-processing stages to the comment text dataset; the steps are shown in Figure 4.

Transform Case
At this stage, the Transform Case operator in RapidMiner is applied. It converts all uppercase letters to lowercase. The results of the case conversion can be seen in Table 1.

Tokenizing
At this stage, the Tokenize operator in RapidMiner is applied. It splits the text into individual words and removes mentioned usernames (@username), hashtags (#hashtag), punctuation marks, numbers, links (http://), emoticons or emojis, and other unnecessary symbols or special characters in each document, because only text is used. Foreign words are also removed because only Indonesian words are used. Words preceded by a negation are joined to it with an underscore "_" so that the meaning stays clear; for example, "not good" becomes "not_good", because otherwise the word "good" on its own would no longer carry the negative meaning. An example can be seen in Table 2.

Before: rt @oreoqueenos: di kelas kalian ada game apa aja ? di kelas ku ada uno kartu , uno stacko , moba , pubg sama scrabble ?? yang mau liat list juara sekolah kita disini https://games.grid.id/read/151616294/inilah-parapemenang-kompetisimobile-legends-diturnamen-next-2019?page=all
After: di kelas kalian ada game apa aja di kelas ku ada uno kartu uno stacko moba pubg sama scrabble yang mau liat list juara sekolah kita disini gamesgrididread151616294inilahparapemenangkompetisimobilelegendsditurnamennext2019pageall
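The cleaning rules listed for this stage can be approximated with regular expressions, as in the hedged Python sketch below. It is not the RapidMiner Tokenize operator: the function name `clean_tweet`, the exact patterns, and the sample tweet (including its example.com link) are assumptions for illustration. Unlike the Table 2 example, this sketch drops links entirely instead of keeping their letters.

```python
import re


def clean_tweet(text):
    """Lowercase a tweet and return its alphabetic tokens."""
    text = text.lower()                        # transform case
    text = re.sub(r"^rt\s+", "", text)         # leading retweet marker
    text = re.sub(r"https?://\S+", " ", text)  # links
    text = re.sub(r"[@#]\w+:?", " ", text)     # mentioned usernames, hashtags
    text = re.sub(r"[^a-z\s]", " ", text)      # numbers, punctuation, emoji
    return text.split()


# Hypothetical tweet modeled on the Table 2 example.
sample = ("RT @oreoqueenos: di kelas kalian ada game apa aja ? "
          "#esports https://example.com/x 2019")
print(clean_tweet(sample))
```

The order of the substitutions matters: links are removed before the generic character filter, otherwise their letters would survive as a single long token.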

Filter Token by Length and Filter Stop words
The filtering stage selects the important words from the tokenization results.
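A minimal Python sketch of the two filters named in this section is given below. The minimum length of 3 letters follows the cleaning description earlier in the paper; the function name `filter_tokens` and the very small stop list are hypothetical and stand in for whatever Indonesian stop-word list was actually used.

```python
# Hypothetical, deliberately tiny Indonesian stop list for illustration only.
STOP_WORDS = {"yang", "dan", "di", "ke", "dari", "ada"}


def filter_tokens(tokens, min_len=3, stop_words=STOP_WORDS):
    """Keep tokens with at least min_len letters that are not stop words."""
    return [t for t in tokens if len(t) >= min_len and t not in stop_words]


tokens = ["di", "kelas", "ada", "game", "yang", "uno"]
print(filter_tokens(tokens))
```

Applying the length filter and the stop list in one pass keeps the original order of the remaining tokens.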

Model Classification
The next step in this research is to build models with classification algorithms for the comment text dataset that has gone through the pre-processing stage. This stage uses two classification algorithms, namely Naïve Bayes (NB) and Support Vector Machine (SVM), with the Synthetic Minority Over-Sampling Technique (SMOTE). RapidMiner version 8.2 is used to process the comment text dataset that has gone through data preparation with text pre-processing. In the first stage, the comment text data are loaded into the tool from an Excel file and then processed with the Naïve Bayes (NB) and Support Vector Machine (SVM) algorithms to get the initial results of each algorithm, as can be seen in Figure 2. After this first stage, the study continues by comparing the two algorithms with the Synthetic Minority Over-Sampling Technique (SMOTE) added. Using SMOTE in the modeling process aims to increase the accuracy of the classification results of the NB and SVM algorithms; the process can be seen in Figure 5.

Evaluation Model Classification
The evaluation phase aims to determine the usefulness of the model created in the previous step. 10-fold cross-validation is used for evaluation. In the test shown in Figure 6, the number of folds is set to 10. The dataset is therefore divided into 10 parts, each containing the same proportion of every class. The data used have been cleaned and pre-processed. They are loaded with the Read Excel operator, because the dataset is stored in an Excel file. The Process Documents from Data operator converts the text into document vectors. The validation data consist of training data and testing data. At this point, SMOTE is used: up-sampling balances the data. The Cross Validation operator is used to classify and evaluate the sentiment analysis through the 10-fold validation experiment. Based on the confusion matrices in Table 4 and Table 5, the accuracy, precision, and recall obtained with SMOTE sampling can be seen in Table 6 and Table 7.
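The idea of dividing the dataset into 10 parts with the same class proportion in each part can be sketched as follows. This is a plain-Python illustration, not the RapidMiner Cross Validation operator; the function name `stratified_folds`, the fixed seed, and the 60/40 toy label distribution are assumptions.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split example indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the indices of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Hypothetical label distribution, not the study's data.
labels = ["positive"] * 60 + ["negative"] * 40
folds = stratified_folds(labels)
for test_fold in folds:
    # Each fold serves once as test data; the other nine form the training data.
    train = [i for fold in folds if fold is not test_fold for i in fold]
    assert len(train) + len(test_fold) == len(labels)
```

Each of the 10 iterations trains on nine folds and tests on the remaining one, and the reported score is the average over the 10 iterations.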
The comparison of accuracy and area under the curve (AUC) for the algorithms used is as follows. Based on Table 9, the accuracy of the Naïve Bayes algorithm is 50.74%, which falls in the quite good category. After SMOTE up-sampling (Table 10), accuracy increases to 70.50%, which falls in the good category and shows that SMOTE optimization is well suited to the NB algorithm. Based on Table 11, the accuracy of the Support Vector Machine algorithm is 70.50%, which falls in the good category. After SMOTE up-sampling (Table 12), accuracy decreases to 66.92%, which is still quite good but shows that SMOTE optimization is not well suited to the SVM algorithm in terms of accuracy. Figure 7 shows the Area Under Curve (AUC, optimistic) from the validation of the Naïve Bayes algorithm, which is 0.771. This falls in the average category (0.70-0.80). After SMOTE up-sampling (Figure 8), the AUC increases to 0.954, which falls in the very good category (0.90-1.00) and shows that SMOTE optimization is well suited to the NB algorithm.
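Accuracy, precision, and recall are derived from the four cells of a binary confusion matrix. The sketch below shows the standard formulas in Python; the function name and the counts are hypothetical and are not the values from the paper's confusion matrix tables.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision and recall for the positive class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # share of correct predictions
    precision = tp / (tp + fp)                  # predicted positives that are correct
    recall = tp / (tp + fn)                     # actual positives that are found
    return accuracy, precision, recall


# Hypothetical counts chosen only to illustrate the arithmetic.
acc, prec, rec = classification_metrics(tp=90, fp=50, fn=10, tn=50)
print(f"accuracy={acc:.2%} precision={prec:.2%} recall={rec:.2%}")
```

A classifier can combine high recall with moderate precision, as in this toy example, when it labels many comments as positive.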
Figure 9 shows the Area Under Curve (AUC, optimistic) from the validation of the Support Vector Machine algorithm, which is 0.508. This falls in the failure category (0.50-0.60). After SMOTE up-sampling (Figure 10), the AUC increases to 0.832, which falls in the good category (0.80-0.90) and shows that SMOTE optimization improves the AUC of the SVM algorithm. Based on the recapitulation of this analysis, evaluation using the NB algorithm with SMOTE is the best solution for obtaining the highest accuracy and AUC values.
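The AUC categories used in this discussion can be written as a small lookup function. The bands for failure, average, good, and very good follow the ranges quoted in the text; the label "poor" for 0.60-0.70, the handling of values below 0.50, and the choice to treat each lower bound as inclusive are assumptions, since the text does not state them.

```python
def auc_category(auc):
    """Map an AUC value to the qualitative category used in the text."""
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    if auc >= 0.90:
        return "very good"
    if auc >= 0.80:
        return "good"
    if auc >= 0.70:
        return "average"
    if auc >= 0.60:
        return "poor"  # assumption: this band is not named in the text
    if auc >= 0.50:
        return "failure"
    return "below chance"  # assumption: not covered in the text


# AUC values reported in this section.
for name, auc in [("NB", 0.771), ("NB + SMOTE", 0.954),
                  ("SVM", 0.508), ("SVM + SMOTE", 0.832)]:
    print(name, auc, auc_category(auc))
```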

Conclusion
From the results of research on applying sentiment classification algorithms to e-sports for the education curriculum, it can be concluded that the Naïve Bayes algorithm achieves an accuracy of 70.32%, a precision of 64.41%, and a recall of 90.36%. When the Support Vector Machine algorithm is optimized using the SMOTE method, its accuracy is 66.92%, its precision is 61.31%, and its recall is 92.15%. Based on these results, it can be concluded that the Naïve Bayes algorithm has a higher accuracy than the Support Vector Machine algorithm, with a difference of 3.4 percentage points. Thus, the Naïve Bayes algorithm predicts better when analyzing sentiment on e-sports for the student learning curriculum.
For future work, subsequent studies are expected to use more records so that the accuracy comparison becomes more reliable. It is also hoped that further research will apply different methods, or extend this research by optimizing the compared algorithms with Particle Swarm Optimization or other optimization methods.

Figure 4. The stages in pre-processing

TABLE 2. EXAMPLE OF TOKENIZING

A stop-list algorithm (which removes unimportant words) or a word-list algorithm (which keeps important words) can be used. At this stage, the non-standard Indonesian words used must also be normalized to a standard form (for example, the token "ke"). The results of filtering can be seen in Table 3.

Table 8 .
R. Ardianto, et.al., Sentiment Analysis on E-Sports for Education Curriculum