A graph proximity feature augmentation approach for identifying accounts of terrorists on twitter

Ahmed Aleroud; Nisreen Abu-Alsheeh; Emad Al-Shawakfa

doi:10.1016/j.cose.2020.102056

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

With the popularity of social networks, terrorist groups such as ISIS have encouraged others to follow their activities, share their ideas, recruit fans, radicalize communities, and raise funds to support future attacks. This has led to the emergence of radicalized online accounts that belong to terrorists or their fans. Existing counter-terrorism investigation techniques that aim to suspend such accounts rely on user reports or on syntax-based sentiment analysis, neither of which is accurate on short texts such as the tweets shared by terrorists. This work proposes a feature augmentation approach that enriches the content of tweets before they are examined for radicalized content. The augmented tweets are then used to classify accounts into Pro-ISIS or Anti-ISIS categories. We utilized topic modeling as a baseline method for feature augmentation. We studied how using tweets from different time intervals affects the quality of the generated models that classify tweets and the corresponding accounts. We then introduced a novel feature augmentation approach that utilizes Neighborhood Overlap, a graph proximity technique that discovers terms having a strong relationship with the Pro-ISIS category. Terms extracted from tweets are represented as nodes in a graph, which is then partitioned into clusters containing different terms. Terms in strongly connected parts of each cluster are added to the original term vectors of the tweets based on the similarity between those terms and each tweet. We compared our approach with baseline augmentation techniques such as Term-to-Term correlation and Topic Modeling, as well as other existing techniques. Experimental results on a dataset containing Pro- and Anti-ISIS tweets show that our approach is promising for automating the identification of terrorist content online.
The results show that using graph proximity measures such as Neighborhood Overlap for term augmentation yields higher Precision, Recall, and F-measure than the typical approaches. In addition, we found that applying time-based analysis with term augmentation to identify radicalized accounts improved the Precision of the investigation process.
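The abstract does not give the exact formulation the authors used, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes that two terms are linked when they co-occur in a tweet, and it uses the common textbook definition of Neighborhood Overlap (shared neighbors divided by all other neighbors of the two endpoints). The function names, the toy terms, and the 0.5 threshold are hypothetical, and the clustering and tweet-similarity steps described in the abstract are omitted.

```python
from collections import defaultdict
from itertools import combinations


def build_term_graph(tweets):
    """Link two terms whenever they co-occur in at least one tweet (assumption)."""
    graph = defaultdict(set)
    for terms in tweets:
        for a, b in combinations(sorted(set(terms)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def neighborhood_overlap(graph, a, b):
    """Shared neighbors of a and b divided by all their neighbors other than a and b."""
    na = graph.get(a, set())
    nb = graph.get(b, set())
    shared = na & nb
    union = (na | nb) - {a, b}
    return len(shared) / len(union) if union else 0.0


def augment(tweet_terms, graph, threshold=0.5):
    """Add neighboring terms whose overlap with a term in the tweet meets the threshold."""
    original = set(tweet_terms)
    added = set()
    for term in original:
        for neighbor in graph.get(term, ()):
            if neighbor in original:
                continue
            if neighborhood_overlap(graph, term, neighbor) >= threshold:
                added.add(neighbor)
    return original | added


# Toy, made-up token lists standing in for preprocessed tweets.
tweets = [
    ["alpha", "beta", "gamma"],
    ["alpha", "beta", "delta"],
    ["delta", "epsilon"],
]
graph = build_term_graph(tweets)
augmented = augment(["alpha"], graph)
print(sorted(augmented))
```

In this sketch a short tweet containing only one term picks up the terms that sit in the same tightly connected neighborhood, while weakly attached terms are left out. A classifier would then be trained on the augmented term vectors rather than the original ones.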

Original language	English (US)
Article number	102056
Journal	Computers and Security
Volume	99
DOIs	https://doi.org/10.1016/j.cose.2020.102056
State	Published - Dec 2020
Externally published	Yes

Keywords

Feature augmentation
Graph Neighborhood
Latent Dirichlet allocation (LDA)
Social network analysis
Temporal analysis
Terrorism Informatics

ASJC Scopus subject areas

General Computer Science
Law

Cite this

@article{fd54c3d0e623400c87e6d4489d15e9c0,
title = "A graph proximity feature augmentation approach for identifying accounts of terrorists on twitter",
keywords = "Feature augmentation, Graph Neighborhood, Latent dirichlet allocation (lda), Social network analysis, Temporal analysis, Terrorism Informatics",
author = "Ahmed Aleroud and Nisreen Abu-Alsheeh and Emad Al-Shawakfa",
note = "Publisher Copyright: {\textcopyright} 2020",
year = "2020",
month = dec,
doi = "10.1016/j.cose.2020.102056",
language = "English (US)",
volume = "99",
journal = "Computers and Security",
issn = "0167-4048",
publisher = "Elsevier Limited",
}

TY - JOUR
T1 - A graph proximity feature augmentation approach for identifying accounts of terrorists on twitter
AU - Aleroud, Ahmed
AU - Abu-Alsheeh, Nisreen
AU - Al-Shawakfa, Emad
PY - 2020/12
Y1 - 2020/12
KW - Feature augmentation
KW - Graph Neighborhood
KW - Latent dirichlet allocation (lda)
KW - Social network analysis
KW - Temporal analysis
KW - Terrorism Informatics
UR - http://www.scopus.com/inward/record.url?scp=85092238247&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85092238247&partnerID=8YFLogxK
U2 - 10.1016/j.cose.2020.102056
DO - 10.1016/j.cose.2020.102056
M3 - Article
AN - SCOPUS:85092238247
SN - 0167-4048
VL - 99
JO - Computers and Security
JF - Computers and Security
M1 - 102056
ER -
