Machine Learning-Based Analysis of Anti-U.S. Perception on Mass Media – OSCAR Celebration of Student Scholarship and Impact

Author(s): Ekrem Kaya, Luke Palmieri

Mentor(s): Hamdi Kavak, Computational and Data Sciences

Abstract

Disinformation and misinformation can play a critical role in shaping public opinion and can affect foreign relations in many ways. In particular, intentional disinformation or misinformation can reflect ulterior motives. Analysis of social media and mass media at a national scale can reveal patterns of disinformation and misinformation. Machine learning-based methods such as LDA, sentiment analysis, and emotion analysis can aid in analyzing data of this magnitude. These principles are applied in a case study of Turkey, covering social media (Twitter) and mass media (various Turkish mass media outlets).

Audio Transcript

Hello. My name is Ekrem Kaya, and I am a freshman majoring in Computational and Data Sciences. Luke Palmieri and I currently work with Dr. Hamdi Kavak as part of this research project funded by the Commonwealth Cyber Initiative. Other partner institutions in this project are Old Dominion University and Tidewater Community College. The goal of this project is to investigate the effect of mass media and social media in creating anti-U.S. perceptions among U.S. allies through misinformation and disinformation efforts. The target country for the case study is Turkey, and the actors are Russia and the USA. The social media platform we are analyzing is Twitter, and the mass media we are analyzing are Turkey’s local and international media outlets publishing in Turkish. This presentation summarizes our efforts over the last three months.

I am part of the team as an Undergraduate Research Assistant along with Luke Palmieri, and we are working on the mass media aspect of the project. We collect news articles from various Turkish mass media outlets using the Factiva database. The articles are filtered using specific Turkish keywords related to the U.S. and then downloaded. The full text is later translated into English and stored in an SQLite database along with important metadata, including the title, date of publication, and the subject of interest. So far, we have collected nearly 40,000 articles published between January and December 2021. It is also important to note that this is a novel collection that has yet to be analyzed.
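The filtering and storage steps described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual pipeline: the keyword list, the table layout, the `outlet` column, and the function names are all assumptions made for the example.

```python
import sqlite3

# Hypothetical keyword list; the project's actual Turkish keywords are not stated.
KEYWORDS = ("abd", "amerika")


def mentions_us(text_tr):
    """Return True if any keyword occurs in the Turkish text.

    Note: str.lower() does not follow Turkish casing rules for the
    dotted and dotless I, so this is only an approximation.
    """
    lowered = text_tr.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def create_store(path=":memory:"):
    """Open an SQLite database and create an assumed articles table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS articles (
               id INTEGER PRIMARY KEY,
               outlet TEXT,
               title TEXT,
               published TEXT,
               subject TEXT,
               text_en TEXT
           )"""
    )
    return conn


def store_article(conn, outlet, title, published, subject, text_tr, text_en):
    """Store the translated article only if the Turkish text matches a keyword."""
    if not mentions_us(text_tr):
        return False
    conn.execute(
        "INSERT INTO articles (outlet, title, published, subject, text_en) "
        "VALUES (?, ?, ?, ?, ?)",
        (outlet, title, published, subject, text_en),
    )
    conn.commit()
    return True
```

Storing the publication date as an ISO string (for example `2021-04-24`) keeps later grouping by month simple.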

Right after collecting the news articles, we translate them into English through an automated process our group developed. This allows us to use any existing natural language processing technique available for English. We then apply sentiment analysis techniques to identify the stance for or against the U.S. Sentiment analysis is the process of identifying whether the sentiment of an article is positive, negative, or neutral. Each article is assigned a score representing its sentiment. We tried various Python packages for sentiment analysis, such as Afinn, TextBlob, and VADER, all of which give a sentiment score for each article. We settled on Afinn as our final and most effective package for conducting sentiment analysis on the collected articles.
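To illustrate how a lexicon-based score of this kind works, here is a minimal sketch in the style of Afinn, which sums integer word valences over a text. The tiny lexicon and its values are hypothetical stand-ins for illustration only; they are not the real Afinn word list, and the function names are assumptions.

```python
import re

# Hypothetical toy lexicon (word -> valence); NOT the real Afinn list.
TOY_LEXICON = {
    "good": 3,
    "support": 2,
    "threat": -2,
    "crisis": -3,
    "bad": -3,
}


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every known word; unknown words count as 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)


def sentiment_label(score):
    """Map a numeric score to positive, negative, or neutral."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Because the score is a plain sum, longer articles tend to produce larger magnitudes, so some form of normalization may be worth considering when comparing articles of different lengths.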

We also plan on conducting emotion analysis as well as Latent Dirichlet Allocation (LDA), which is a topic modeling technique. In short, LDA can uncover latent topics shared across a set of documents and show which documents are closer to each other. That will allow us to further filter relevant news articles.

This is the graph of the average sentiment score for the various outlets in 2021. We can see multiple positive and negative extremes throughout the year. For example, we can see that in April, there was an increase in negative sentiment. We also know that during that period of extreme negative sentiment, President Biden recognized the Armenian Genocide.
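An average of this kind could be computed along the following lines. This is only a sketch under stated assumptions: each record is taken to be an (outlet, ISO date, score) tuple, and the function name is hypothetical.

```python
from collections import defaultdict


def monthly_average(records):
    """Average sentiment per (outlet, 'YYYY-MM') pair.

    records: iterable of (outlet, iso_date, score) tuples, where
    iso_date looks like '2021-04-24' (an assumed format).
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [running sum, count]
    for outlet, iso_date, score in records:
        key = (outlet, iso_date[:7])  # keep only the year and month
        totals[key][0] += score
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}
```

Plotting the resulting values month by month for each outlet would give a curve of the type shown in the graph.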

For now, we are looking to identify more connections between political events involving Turkey and the U.S. and how these events are reflected in mass media. Our future steps are collecting data from 2015 onward and performing sentiment and emotion analysis on all articles. Our long-term goal is to become a research hub for compiling data about anti-American views related to disinformation and misinformation in Turkey and possibly other countries. Thank you for listening, and please drop your questions below.

3 replies on “Machine Learning-Based Analysis of Anti-U.S. Perception on Mass Media”

Hi Ekrem! What an interesting topic! Will you continue this research in the Spring? Did the language skills of the researchers themselves impact the results in any way?

Good Morning! I will be continuing this project in the Spring along with my team. Having native Turkish speakers on the team definitely helped us through a few of the issues we had.

What an interesting way to use technology to understand communication via mass media. I really enjoyed the presentation. Do you plan to collect more articles from Turkey or to move to a different country?
