A System to Study Anti-American Misinformation and Disinformation Efforts on Social Media – OSCAR Celebration of Student Scholarship and Impact

Author(s): Gowri Prathap, Luke Palmieri, Ekrem Kaya

Mentor(s): Hamdi Kavak, Computational and Data Sciences

Abstract

Misinformation and disinformation are two significant challenges of our century with societal, political, and economic implications. This study focuses on building a software system to investigate the role of social media in instilling anti-American sentiment among US allies through misinformation and disinformation efforts. Our system has four major components, which are executed stepwise: (1) Data collection, (2) Data handling, (3) Machine learning, and (4) Analysis. We designed and implemented this system for Twitter using the Python ecosystem. As a use case, we selected Turkey, a US ally and NATO member country with notable support for anti-American views. We automatically translated the tweets into English and used sentiment and emotion analysis to determine support for or opposition to the USA. Then, we categorized accounts into bots and non-bots. From January 2019 to December 2021, there were 11,988,406 Turkish tweets related to the USA. Our data showed several peaks, such as President Biden’s inauguration day on January 20, 2021, and Biden’s recognition of the Armenian Genocide on April 24, 2021. Turkish tweets against the United States are dominated by disgust, followed by anger and fear.

Audio Transcript

Hello. My name is Gowri Prathap, and I am a senior majoring in Computational and Data Sciences at George Mason University. I work with Luke Palmieri, Ekrem Kaya, Alex Korb, Dr. Saltuk Karahan, and Dr. Hamdi Kavak as part of this research project funded by the Commonwealth Cyber Initiative. Other partner institutions in this project are Old Dominion University and Tidewater Community College.

The goal of this study is to investigate the effect of social media in creating anti-U.S. perceptions among U.S. allies through misinformation and disinformation efforts. The target country for the case study is Turkey, and the social media platform we are analyzing is Twitter. This presentation summarizes our efforts in the last eight months.

This research develops a software system to investigate the influence of misinformation and disinformation on social media in instilling anti-U.S. sentiment among allies. There are four parts to our system. The first step is to collect data. Our use case is Turkey, a U.S. ally with significant anti-American sentiment. By providing precise Turkish keywords relevant to the United States, we were able to gather Twitter data using Twitter’s filtered stream API. The handling of data is the next phase. The information is saved in a database, and the text is translated into English automatically using the Google Translate API. Machine learning is the next phase. Sentiment and emotion analyses are performed on Twitter data to determine the anti-U.S. stance. Emotion analysis detects emotions such as happiness, anger, and fear, while sentiment analysis assesses whether a text represents positive, negative, or neutral feelings. The Twitter accounts are then classified as bots or non-bots. Analysis is the last phase in our process, and it requires subject matter expert involvement.

We used sentiment and emotion analysis algorithms after collecting and managing the data.
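To make the sentiment step described above concrete, here is a minimal sketch of lexicon-based scoring, the general idea behind word-list tools. It is not the project's code: the tiny `TOY_LEXICON`, the function names, and the zero threshold for the neutral label are all assumptions chosen for illustration, and real packages use much larger curated lexicons and extra rules.

```python
import re

# Hypothetical toy lexicon mapping words to valence scores.
# Real sentiment tools ship far larger, curated word lists.
TOY_LEXICON = {
    "good": 2, "great": 3, "support": 1,
    "bad": -2, "terrible": -3, "threat": -2,
}


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every known word in the (translated) text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)


def sentiment_label(score):
    """Map a numeric score to positive, negative, or neutral."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ["This is a great and good decision",
                  "A terrible threat",
                  "The meeting is on Monday"]:
        score = sentiment_score(tweet)
        print(tweet, "->", score, sentiment_label(score))
```

Summing word valences keeps the sketch easy to follow, but it ignores negation and intensity, which is one reason a study might compare several sentiment tools instead of relying on a single one.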
Afinn, TextBlob, and VADER were used to conduct sentiment analysis and provide the sentiment score of a tweet. We used the Python packages Pysentimiento and Text2Emotion to conduct emotion analysis.

After that, we performed Twitter user analysis using the TweetBotOrNot package to categorize accounts into bots and non-bots. This package could classify only a portion of the users as bots or non-bots. We used a machine learning model to predict the bot probabilities of the remaining users. We selected the parameters relevant to our research from those returned with the Twitter user data. Our approach used feature importance to determine which attributes would have the most impact on the model. To determine the best parameters for random forest regression, our method performed hyperparameter optimization.

Here we visualize a time series of the number of Turkish tweets related to the USA from January 2019 to December 2021. There are almost 12 million unique tweets from 2019 to 2021. On an average day, the number of tweets is above 10,000. However, we can see peaks in our data, such as from October 7th to 16th, 2019, due to the U.S. withdrawal of troops from Syria, Turkey’s invasion of northern Syria, and Trump’s announcement of sanctions on Turkey as a result. Another peak was from January 3rd to 12th, 2020, due to the killing of Qasem Soleimani and the Iranian retaliation. Erdoğan expressed concern about instability in Iraq as a result of the attack. Another important peak was from November 4th to 9th, 2020, due to the U.S. Presidential Election and the peak of the Azerbaijani-Armenian war over Nagorno-Karabakh. Another peak was from January 6th to 7th, 2021, due to Turkish users responding to the Capitol riots.

This is the sentiment score time series, and similarly, we can see the drop in sentiment during the U.S. withdrawal of troops from Syria and the killing of Qasem Soleimani. Here we visualize the emotion analysis.
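The peaks in daily tweet volume mentioned above could, in principle, be flagged automatically. The sketch below shows one simple way to do that: mark any day whose count exceeds the mean by more than `k` standard deviations. The presentation does not say how peaks were identified, so the function name, the threshold rule, the value of `k`, and the toy counts are all assumptions for illustration.

```python
from statistics import mean, pstdev


def find_peak_days(daily_counts, k=2.0):
    """Return the days whose tweet count exceeds mean + k * std deviation."""
    counts = list(daily_counts.values())
    threshold = mean(counts) + k * pstdev(counts)
    return [day for day, count in daily_counts.items() if count > threshold]


# Made-up daily counts for illustration only; these are not the study's data.
toy_counts = {
    "day-01": 10000, "day-02": 10500, "day-03": 9800, "day-04": 10200,
    "day-05": 9900, "day-06": 45000, "day-07": 10100, "day-08": 10300,
}

if __name__ == "__main__":
    print(find_peak_days(toy_counts))  # prints ['day-06']
```

A fixed global threshold is the simplest choice. Over a three-year series, a rolling window would adapt better to slow changes in baseline activity, at the cost of one more parameter to choose.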
In the Pysentimiento emotion analysis, Turkish tweets about the United States are dominated by disgust, followed by anger, fear, joy, sadness, and surprise. In the emotion analysis conducted by Text2Emotion, the prominent emotion is fear, followed by surprise, sadness, happiness, and anger.

Here we visualize the analysis of tweets by bots and non-bots. Bots appear to generate more negative tweets against the United States. In addition, they send out more tweets on average than non-bots. In comparison to non-bots, bots have a lower average positive sentiment and a greater average negative and neutral sentiment. The average happiness and sadness scores of non-bots are higher than those of bots. Bots score higher on average in terms of anger, fear, and surprise than non-bots. This means that bots are more likely to send tweets about the United States that are filled with anger, fear, and surprise.

In this study, we presented a software system for analyzing misinformation and disinformation attempts on social media that aim to promote and encourage anti-American beliefs in Turkey, a U.S. partner country. We used this system to detect a number of political events between 2019 and 2021 and to understand how the social media audience reacted to them. In general, there are a lot of negative feelings against the United States. In the Text2Emotion analysis, fear, surprise, and sadness are the most common emotions. Our approach can be a valuable tool for researching anti-American sentiments and a starting point for studying misinformation and disinformation campaigns. As a further phase, we will dig deeper into the political events and raise the level of automation in our misinformation and disinformation analysis process.
Thank you for listening.
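The bot versus non-bot comparison in the transcript comes down to averaging scores per group. A minimal sketch of that aggregation follows, assuming each tweet record carries a hypothetical `is_bot` flag and a numeric `sentiment` field. These field names and the toy values are illustrative and do not come from the project.

```python
from collections import defaultdict
from statistics import mean


def average_by_group(tweets, field):
    """Average a numeric field separately for bot and non-bot accounts."""
    groups = defaultdict(list)
    for tweet in tweets:
        key = "bot" if tweet["is_bot"] else "non-bot"
        groups[key].append(tweet[field])
    return {key: mean(values) for key, values in groups.items()}


# Toy records with hypothetical field names; not the study's data.
toy_tweets = [
    {"is_bot": True, "sentiment": -0.5},
    {"is_bot": True, "sentiment": -0.25},
    {"is_bot": False, "sentiment": 0.25},
    {"is_bot": False, "sentiment": 0.5},
]

if __name__ == "__main__":
    print(average_by_group(toy_tweets, "sentiment"))
```

The same helper could be applied to an emotion score, such as anger or fear, by passing a different field name.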

4 replies on “A System to Study Anti-American Misinformation and Disinformation Efforts on Social Media”

Thank you for your presentation. Social media literacy is so important!

Hello,
Thank you for your comment, I agree! Thank you for watching as well.

~ Gowri Prathap

Nice work. Alarming but perhaps not surprising result that bots are more prolific and more “negative” than actual people.

This was a fascinating presentation. Thank you so much for this important work on misinformation.
