Content analysis of Persian/Farsi Tweets during COVID-19 pandemic in Iran using NLP


May 17, 2020

Cornell University

As COVID-19 has spread rapidly and widely in the United States and around the world, the pandemic is disrupting all aspects of daily life in affected countries in unforeseen ways. Economic activity has been disrupted globally on an unprecedented scale, and governments are resorting to a range of policies and measures to manage the primary and secondary health, economic, and financial aspects of the crisis. While each country will have its own experience of dealing with the pandemic, societies share much in how they deal with and react to the spread of the virus. These commonalities stem from the nature of the virus itself and from the biological and psychological similarities of humankind regardless of geographical boundaries. Policies devised for mitigation and control of the virus are also similar across borders. Iran, along with China, South Korea, and Italy, has been among the countries hit hard in the first wave of the viral spread, the cause of which has yet to be fully explained. Iranians use social media outlets such as Telegram, WhatsApp, Twitter, Instagram, and Facebook to receive a large portion of their daily news, to share information with one another, and to express their opinions about developments in the country, such as social unrest or, in this case, events and issues related to the spread of COVID-19. Leveraging machine learning and natural language processing (NLP) techniques, we are conducting an ongoing analysis of the reactions of Persian/Farsi-speaking users on social media, starting with Twitter. In this work, we applied topic modeling to find the themes of tweets posted in Persian/Farsi about COVID-19, followed by manual annotation of a random subset of tweets to assess the distribution of content types among all tweets.
We believe our framework can be valuable for monitoring public reaction to ongoing local and international developments related to the COVID-19 pandemic, and can additionally serve as a tool and platform for future major economic, political, or health-related events among Persian/Farsi-speaking users.
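
The annotation step described above (drawing a random subset of tweets, labeling them by hand, and estimating how content types are distributed) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the placeholder corpus, the sample size, and the content-type labels ("news", "opinion", "satire") are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def sample_for_annotation(tweets, k, seed=0):
    """Draw a reproducible simple random sample of k tweets, without replacement."""
    rng = random.Random(seed)
    return rng.sample(tweets, k)


def estimate_distribution(labels):
    """Return the share of each content type among the annotated tweets."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Hypothetical corpus: placeholder strings stand in for collected tweets.
corpus = [f"tweet_{i}" for i in range(1000)]
subset = sample_for_annotation(corpus, k=100, seed=42)

# Hypothetical annotations: in practice a human annotator assigns one label
# to each sampled tweet. These labels and counts are made up.
example_labels = ["news"] * 40 + ["opinion"] * 35 + ["satire"] * 25
shares = estimate_distribution(example_labels)
```

Because the subset is drawn uniformly at random, the shares computed on the annotated sample can serve as estimates of the shares in the full collection, with uncertainty that shrinks as the sample grows.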
