IRN AP06851248 “Development of models and algorithms for semantic analysis to identify extremist content in web resources and creation of a tool for cyber forensics”

  • Project leader: Mussiraliyeva Shynar Zhenisbekovna

 

  • The main members of the research team:
  • Omarov B.S.
  • Bolatbek M.A.
  • Baispay G.B.
  • Narbaeva S.M.
  • Ospanov R.K.
  • Medetbek Zh.B.
  • Turarbek A.T.

 

  • The objects of the study are the texts and metadata of users of the social networks "Vkontakte" and "YouTube", as well as Bitcoin transactions.

 

  • The purpose of the research is to conduct a comprehensive study and develop models and algorithms for semantic data analysis that detect extremist content in web resources, methods for identifying the users involved, and algorithms for graphical visualization of links; to create and study a cryptocurrency transaction analysis model that identifies suspicious transactions; and to develop the ExWeb software and cyber-forensic tools for countering extremism.

 

  • Methods: the study used machine learning methods, methods for graphical visualization of connections, analysis of demographic attributes, and social network analysis.

 

  • Results and novelty: for the first time, a corpus of extremist texts in the Kazakh language was created for training and testing machine learning methods that identify such texts. Also for the first time, a semantic analysis model was built that takes the peculiarities of the Kazakh language into account; it is distinguished by applying the TF-IDF method to bigrams previously processed by a stemming algorithm and feeding the result to the word-embedding layer of an LSTM network, which increases the accuracy of identifying extremist texts. A software module was developed for collecting and analyzing web content to determine whether it has an extremist orientation. The method can be used as one element of a system for monitoring and collecting data from social networks. User graphs were built based on metadata. A Bitcoin transaction analysis model was developed to detect suspicious transactions. A software application has been developed.
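
    The feature-extraction step named above (TF-IDF computed over bigrams of stemmed words) can be illustrated with a minimal sketch. It is not the project's implementation: the suffix list, the function names `toy_stem`, `stemmed_bigrams` and `tfidf_bigrams`, and the plain unsmoothed `log(N / df)` weighting are assumptions made for illustration, and the LSTM stage is not reproduced. A real system would use a stemmer built for Kazakh morphology.

```python
import math
from collections import Counter

# Hypothetical suffix list for illustration only; a real Kazakh stemmer
# would rely on the language's actual morphology.
TOY_SUFFIXES = ("ing", "ed", "s")


def toy_stem(word):
    """Strip the first matching toy suffix, keeping at least three characters."""
    for suffix in TOY_SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stemmed_bigrams(text):
    """Lowercase, split on whitespace, stem each token, pair adjacent stems."""
    stems = [toy_stem(tok) for tok in text.lower().split()]
    return list(zip(stems, stems[1:]))


def tfidf_bigrams(docs):
    """Return one {bigram: weight} dict per document.

    tf  = bigram count / number of bigrams in the document
    idf = log(N / df), df = number of documents containing the bigram
    """
    per_doc = [Counter(stemmed_bigrams(d)) for d in docs]
    df = Counter(bg for counts in per_doc for bg in counts)
    n_docs = len(docs)
    weights = []
    for counts in per_doc:
        total = sum(counts.values())
        weights.append(
            {bg: (c / total) * math.log(n_docs / df[bg]) for bg, c in counts.items()}
        )
    return weights


if __name__ == "__main__":
    corpus = ["users posted messages", "users posted videos"]
    for doc_weights in tfidf_bigrams(corpus):
        print(doc_weights)
```

    In this toy corpus the bigram shared by both documents receives weight zero, while the bigrams unique to one document receive positive weights; this is the property that makes TF-IDF useful for highlighting distinctive phrases before classification.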

 

  • Scope: target consumers of the results. The fundamental results can be used by the world scientific community; the applied results, in the form of a methodology and algorithms, can be used by authorized bodies to ensure information security, protect critical infrastructure, and counter Internet extremism.