Humans and bots web session identification using k-means clustering
Denial of Service (DoS) attacks are among the most damaging threats to Internet security today. An attacker overloads a machine or network resource with disruptive tasks so that it becomes unavailable to its intended users. Many major companies have been targeted. Because such an attack can be launched easily from nearly any location, finding those responsible can be extremely difficult. Web crawlers now account for a large share of website traffic, which leaves websites exposed to attacks carried out by crawlers. This study analyzes a web log file, obtained from a popular university website, that records all server requests from 30 June 2018 to 30 July 2018. The requests are grouped into web sessions so that the behavior of each session can be identified, and the K-means algorithm is then used to cluster the sessions based on these behaviors. Using the assumptions set out in this work, each cluster is then labeled as human visitors, benign crawlers, malicious crawlers, or unknown requests.
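The pipeline described in the abstract (group log requests into sessions, describe each session by its behavior, then cluster with K-means) can be illustrated with a minimal sketch. The abstract does not give the session definition, the features, or the number of clusters, so everything below is an assumption for illustration: clients are identified by IP address plus user agent, a session ends after 30 minutes of inactivity, the three features are request count, duration, and whether `/robots.txt` was requested, and the function names `build_sessions`, `session_features`, and `kmeans` are hypothetical. The clustering is plain Lloyd's algorithm written out in standard-library Python, not the study's implementation.

```python
from collections import defaultdict
from datetime import timedelta

# Assumed inactivity threshold; the abstract does not state one.
SESSION_TIMEOUT = timedelta(minutes=30)


def build_sessions(requests):
    """Group (ip, user_agent, timestamp, path) tuples into sessions.

    A client is assumed to be an (ip, user_agent) pair. A new session
    starts when the gap between consecutive requests exceeds the timeout.
    """
    by_client = defaultdict(list)
    for ip, agent, ts, path in requests:
        by_client[(ip, agent)].append((ts, path))

    sessions = []
    for client, hits in by_client.items():
        hits.sort()  # order each client's requests by timestamp
        current = [hits[0]]
        for hit in hits[1:]:
            if hit[0] - current[-1][0] > SESSION_TIMEOUT:
                sessions.append((client, current))
                current = [hit]
            else:
                current.append(hit)
        sessions.append((client, current))
    return sessions


def session_features(hits):
    """Return [request count, duration in seconds, robots.txt requested]."""
    duration = (hits[-1][0] - hits[0][0]).total_seconds()
    robots = 1.0 if any(path == "/robots.txt" for _, path in hits) else 0.0
    return [float(len(hits)), duration, robots]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k), key=lambda c: squared_distance(p, centroids[c])
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

In practice the features would likely need scaling before clustering, since a raw duration in seconds dominates the distance. Mapping each resulting cluster to a label such as human visitor or benign crawler depends on the assumptions made in the study itself and is not shown here.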
Faculty of Computer Science and Artificial Intelligence
Physical Sciences, General Computer Science, General Engineering
Indexed in Scopus
Clustering, Data mining, K-means, Web usage mining
Medhat, Muhammad; Hassan, Yasser Fouad; and Elsayed, Ashraf, "Humans and bots web session identification using k-means clustering" (2019). Faculty of Computer Science & Artificial Intelligence. 3.