Technology Ethics: Analysis of Twitter Hashtags

Author(s): Ghaaliyah Brown, Brett Strosnider

Mentor(s): Aditya Johri, Department of Information Sciences and Technology

Abstract

As artificial intelligence advances, discussions surrounding technology ethics are gaining prominence. Social media sites can provide valuable insight into these conversations. Over a period of roughly one month, the Twitter API was used to retrieve 6,230 original tweets matching the keywords "technology ethics," "tech ethics," and "ai ethics" and the hashtags #aiethics, #ethicalai, and #responsibleai. Descriptive statistical analysis was performed in Python. Only 4% of the tweets originated from verified accounts. Although the number of friends, followers, and favorited tweets associated with the source accounts varied widely, the majority of accounts had no friends, no followers, and no favorited tweets. The hashtags most commonly used in tandem with the search query terms related mostly to policy and governance and to the methods through which artificial intelligence is implemented. Every fourth tweet of the sample was also selected for qualitative coding (N = 1,558). Five categories were established: ethical issues, industry/sector, content/media type, specific technologies discussed, and geographic location. The most common ethical issues discussed related to regulation, algorithm bias, and privacy. Ethical issues were most frequently discussed within the context of the technology industry, government entities, businesses, and healthcare. Social media platforms, robots, and biometric technologies were the most frequently mentioned technologies, and the most frequently mentioned geographic location was India. News articles comprised the majority of the media shared in tweets. Future investigations should include more rigorous statistical analysis as well as qualitative analysis of the contents of media shared in tweets.
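
The "every fourth tweet" selection described in the abstract is a form of systematic sampling. The following is a minimal sketch of one way to do it in pandas, not the authors' actual code: the `every_nth` helper, the `tweet_id` column, and the choice to start at the first tweet are all assumptions made for illustration. Starting at the first of 6,230 rows and keeping every fourth one happens to yield 1,558 rows, which is consistent with the reported N.

```python
import pandas as pd


def every_nth(df: pd.DataFrame, n: int = 4, start: int = 0) -> pd.DataFrame:
    """Systematic sample: keep rows start, start + n, start + 2n, ..."""
    return df.iloc[start::n]


# Placeholder frame standing in for the 6,230 original tweets
# (hypothetical column name; real tweet data is not reproduced here).
tweets = pd.DataFrame({"tweet_id": range(6230)})

coded_sample = every_nth(tweets, n=4)
print(len(coded_sample))  # 1558
```

Slicing by position with `iloc` keeps the sample evenly spread across the collection period, provided the rows are in chronological order.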

Video Transcript

*Voiceover for video is by Ghaaliyah Brown. Each paragraph represents one slide in the presentation.
Hi everyone, my name is Ghaaliyah Brown and I worked with Brett on this project, where we did an analysis of Twitter hashtags related to AI ethics to understand the stakeholders in the conversation and the topics of special interest and concern.
We collected tweets between May 26 and June 30 using the hashtags #aiethics, #ethicalai, and #responsibleai. And we used the keywords technology ethics, tech ethics, and AI ethics as search query terms as well. We retrieved 27,655 tweets and we ended up using 6,230 of those for our analysis; those were the original tweets.
The first part of our project was conducting statistical analysis, so this is where we worked with any numerical data that we collected. We used the Pandas and Seaborn libraries to analyze different quantitative metrics in Python. First, we chose to look at the distribution of verified accounts versus non-verified accounts. And we found that the majority of these tweets were originating from non-verified accounts.
We then looked at some of the different metrics related to the accounts from which the tweets originated, and some metrics relating to the tweets themselves. So in this table, you see that the friends count, followers count, and tweets favorited count were all very variable, so the standard deviation is extremely high, meaning that the data wasn’t closely distributed around the mean. Actually, when we made histograms for these different categories, we found that most of the values were actually 0. In terms of retweet count and favorite count, we weren’t able to properly adjust for some variables, so we chose not to discuss these very much in our report, and this is something that we’d look more into in a future project.
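
The descriptive steps mentioned in this part of the talk (share of verified accounts, spread of the account metrics, and how many values are zero) could be sketched in pandas roughly as follows. This is an illustration under stated assumptions, not the project's code: the column names `verified`, `friends_count`, and `followers_count` are assumed, and the five rows are made-up toy values rather than real data.

```python
import pandas as pd

# Toy stand-in for the account-level data (invented values, assumed column names).
accounts = pd.DataFrame({
    "verified": [False, False, True, False, False],
    "friends_count": [0, 0, 5200, 0, 13],
    "followers_count": [0, 0, 98000, 2, 0],
})

# Fraction of tweets that come from verified accounts.
verified_share = accounts["verified"].mean()

# Mean, median, and standard deviation for each count metric.
metrics = ["friends_count", "followers_count"]
summary = accounts[metrics].agg(["mean", "median", "std"])

# Fraction of rows where each metric is exactly zero.
zero_share = (accounts[metrics] == 0).mean()

print(f"verified share: {verified_share:.0%}")
print(summary)
print(zero_share)
```

Reporting the median and the share of zeros next to the mean makes a heavily skewed distribution visible in a table, which is the pattern the histograms revealed here.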
We also looked at the frequency of tweets per day in our data collection period, and what we found is that there tended to be less activity on the weekends and more activity on the weekdays, but because our data collection period was so short, there isn’t much more we can say beyond that.
We looked at the most common hashtags used in the original tweets, and something I want to point out here is that the most popular terms were either related to policy and governance, so you see governance, policy, and politics on this list, or they were related to methods through which AI is implemented. So machine learning, algorithms, deep learning, those are the kind of terms that tended to appear on this list.
Now moving into the second part of our project, this is where we qualitatively coded the text of the tweets themselves. We chose to code every fourth tweet from that original sample, so we coded a total of 1,558 tweets, and we did so using five different categories.
The first category was geographic location. The majority of these tweets actually didn’t mention a specific geographic location, and so that’s what you see with N/A, but when the tweets did mention a geographic location, they tended to mention India. India was by far the most mentioned, and after that the European Union/UK and the United States; those two categories were also close behind.
In the category ethical issues, we found that the most popular ethical issue is definitely law and regulation, and this follows the trend that we noticed in the most popular hashtags. It’s not surprising, then, that we see law and regulation as the most popular ethical issue. Bias was also another issue that came up a lot. Transparency, privacy, those were also issues that were quite popular.
In the specific technology category, the most popular code that we saw in the text, in the tweets excuse me, was definitely social media platforms, and after that biometrics was also quite commonly mentioned, as were robots.
We also had a category for industry and sector, and what this category was describing is the context in which ethical issues are being discussed. By far, as you’d expect, most tweets were discussing ethical issues within the context of the tech industry, but a lot of them also discussed them in the context of government, which again fits with this trend that we’re definitely seeing here, and of business and healthcare.
Our final category was media types shared, and here we saw that definitely most of the tweets were sharing news articles, but we also saw a lot of tweets that were more of a discussion format, so they usually weren’t sharing any meaningful type of media. They were just part of an ongoing conversation, or they were tweets sharing blog posts, or tweets of an advertising nature.
In future research, we would definitely want to conduct a more rigorous analysis of the quantitative metrics that we used in our statistical analysis. We would also want to look at actually analyzing the content of the media shared in tweets, or investigating pools of tweets pertaining to either specific pieces of regulation or specific technologies.
Thank you very much. Feel free to reach out to us if you have any questions.
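
Two of the quantitative steps described in the transcript, tweets per day by weekday versus weekend and the hashtags that co-occur with the search hashtags, can be sketched as below. This is a hedged illustration only: the `created_at` and `text` column names, the dates, and the tweet texts are invented placeholders, and the regular expression is a simple assumption about what counts as a hashtag.

```python
import re
from collections import Counter

import pandas as pd

# Search hashtags from the study, lowercased, excluded from the co-occurrence count.
QUERY_TAGS = {"aiethics", "ethicalai", "responsibleai"}

# Invented placeholder tweets (dates and texts are not real data).
tweets = pd.DataFrame({
    "created_at": pd.to_datetime(
        ["2024-01-05", "2024-01-05", "2024-01-06", "2024-01-08"]
    ),
    "text": [
        "New rules on #AIethics #governance",
        "#ResponsibleAI needs #policy and #governance",
        "Weekend read on #ethicalai #machinelearning",
        "#aiethics in #MachineLearning",
    ],
})

# Tweets per calendar day; days with no tweets are counted as 0.
daily = tweets.set_index("created_at").resample("D").size()

# Saturday and Sunday have dayofweek values 5 and 6.
weekend_mask = daily.index.dayofweek >= 5
weekday_mean = daily[~weekend_mask].mean()
weekend_mean = daily[weekend_mask].mean()

# Count hashtags that appear alongside the search hashtags.
counts = Counter(
    tag
    for text in tweets["text"]
    for tag in re.findall(r"#(\w+)", text.lower())
    if tag not in QUERY_TAGS
)

print(weekday_mean, weekend_mean)
print(counts.most_common(3))
```

Lowercasing before matching means that #MachineLearning and #machinelearning are counted as the same hashtag.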
