Publications

PNAS Proceedings of the National Academy of Sciences of the United States of America

The global effectiveness of fact-checking: Evidence from simultaneous experiments in Argentina, Nigeria, South Africa, and the United Kingdom

Little evidence exists on the global effectiveness, or lack thereof, of potential solutions to misinformation. We conducted simultaneous experiments in four countries to investigate the extent to which fact-checking can reduce false beliefs. Fact-checks reduced false beliefs in all countries, with most effects detectable more than 2 weeks later and with surprisingly little variation by country. Our evidence underscores that fact-checking can serve as a pivotal tool in the fight against misinformation.
Society for the Study of Addiction

Effectiveness of an optimized text message and Internet intervention for smoking cessation: a randomized controlled trial

This trial evaluates the effectiveness of a combined Internet and text message intervention for smoking cessation compared with an Internet intervention alone. The text message intervention was optimized for engagement in an earlier Multiphase Optimization Strategy (MOST) screening phase.
Tobacco Induced Diseases

Feasibility, acceptability, and preliminary effectiveness of a text messaging intervention for smoking cessation in Vietnam

Text messaging (SMS) smoking cessation programs can reach a large number of cigarette smokers and are effective in increasing quit rates, but their efficacy has not yet been explored in Vietnam.
International Council on Systems Engineering

Evolvability analysis framework: Adding transition path and stakeholder diversity to infrastructure planning

This paper presents the Evolvability Analysis Framework (EAF), a new perspective on evaluating complex infrastructure systems.
Cornell University

Classification and Visualization of Genotype x Phenotype Interactions in Biomass Sorghum

In this paper the researchers introduce a simple approach to understanding the relationship between single nucleotide polymorphisms (SNPs), or groups of related SNPs, and the phenotypes they control.
Taylor & Francis an informa business

Trump sympathy in the Balkans: cross-border populist appeal

Do populist leaders tend to win support from the same kinds of people abroad as they do at home?
Hinrich Foundation: Advancing sustainable global trade

Data is disruptive: How data sovereignty is challenging data governance

The vision of data control, or restricting cross-border data flows, as a means of protecting personal data is disruptive and inaccurate. Whether held by the public or private sector, societies benefit the most when large inventories of data are used, shared, and crossed with other sets of data.

Electronic Nicotine Product Cessation and Cigarette Smoking: Analysis of Waves 3 and 4 From the PATH Study

Identifying predictors of electronic nicotine product (ENP) cessation can inform ENP cessation interventions.

Could Trade Agreements Help Address the Wicked Problem of Cross-Border Disinformation?

Whether produced domestically or internationally, disinformation is a “wicked” problem that has global impacts.

Assessment of IQOS Marketing Strategies at Points-of-Sale in Israel at a Time of Regulatory Transition

This study assessed IQOS marketing strategies and regulatory compliance at IQOS and/or HEETS points-of-sale (POS) in Israel from December 17, 2019 to January 7, 2020, after the ban on advertisement went into effect on March 8, 2019.
Digital Trade & Data Governance Hub

Global Data Governance Mapping: 5 countries case study

This report examines which countries devised innovative laws and structures for data governance. By focusing on innovation, we can gain a better understanding of what governments are doing, how they are doing it, and why. Our five cases reveal that policymakers are both innovating in data governance and using data governance to achieve other important policy goals.
Cambridge University Press

Placebo Selection in Survey Experiments: An Agnostic Approach

Although placebo conditions are ubiquitous in survey experiments, little evidence guides common practices for their use and selection. How should scholars choose and construct placebos?
Taylor & Francis an informa business

Book Chapter: Interactive Propaganda: How Fox News and Donald Trump co-produced false narratives about the Covid-19 crisis

This book examines how the COVID-19 pandemic impacted the flows of communication between politicians, journalists and citizens.

Vestige: Identifying Binary Code Provenance for Vulnerability Detection

Identifying the compilation provenance of binary code helps to pinpoint the specific compilation tools and configurations that were used to produce the executable. Unfortunately, existing techniques are not able to accurately differentiate among closely related executables, especially those generated with slightly different compilation configurations. To address this problem, the researchers designed a new provenance identification system, Vestige.
PNAS Proceedings of the National Academy of Sciences of the United States of America

Elite rhetoric can undermine democratic norms

Democracies depend on candidates and parties affirming the legitimacy of election results even when they lose. These statements help maintain confidence that elections are free and fair and thereby facilitate the peaceful transfer of power. However, this norm has recently been challenged in the United States, where former president Donald Trump has repeatedly attacked the integrity of the 2020 US election. We evaluate the effect of this rhetoric in a multiwave survey experiment, which finds that exposure to Trump tweets questioning the integrity of US elections reduces trust and confidence in elections and increases beliefs that elections are rigged, although only among his supporters. These results show how norm violations by political leaders can undermine confidence in the democratic process.
JAMA The Journal of the American Medical Association

Spread of Misinformation About Face Masks and COVID-19 by Automated Software on Facebook

The dangers of misinformation spreading on social media during the COVID-19 pandemic are known. However, software that allows individuals to generate automated content and share it via counterfeit accounts (or “bots”) to amplify misinformation has been overlooked, including how automated software can be used to disseminate original research while undermining scientific communication.
Cornell University

DCAP: Deep Cross Attentional Product Network for User Response Prediction

User response prediction, which aims to predict the probability that a user will provide a predefined positive response in a given context such as clicking on an ad or purchasing an item, is crucial to many industrial applications such as online advertising, recommender systems, and search ranking.
Taylor & Francis an informa business

Book Chapter: Countering hate speech

Counterspeech refers to communication that responds to hate speech in order to reduce it and negate its potential harmful effects. This chapter defines hate speech and examines some of its negative impacts.
Digital Trade & Data Governance Hub

Global Data Governance Mapping: 52 case studies

Data is the most collected, analyzed, shared, and/or traded good or service around the world. Despite its ubiquity, data governance is a relatively new governance responsibility for many countries. We know little about how nations govern various types of data at the national and international level and what that means for the achievement of other important policy goals.
Cambridge University Press

A New Wave of Research on Civilizational Politics: What Civilizations Are, What Explains Them, and Why They Matter – CORRIGENDUM

A new wave of scholarship has made major advances in how we understand the politics of civilizational identity by drawing powerfully from conceptual tools developed over the years to study other forms of identity.
OSF Preprints

Factual Corrections Eliminate False Beliefs About COVID-19 Vaccines

The spread of misinformation about COVID-19 vaccines threatens to prolong the pandemic, with prior evidence indicating that exposure to misinformation has negative effects on intent to take the vaccine.
Cornell University

Predicting Directionality in Causal Relations in Text

In this work, the researchers test the performance of two bidirectional transformer-based language models, BERT and SpanBERT, on predicting directionality in causal pairs in text. The preliminary results show that predicting direction for inter-sentence and implicit causal relations is more challenging, and that SpanBERT performs better than BERT on causal samples with longer span length. The researchers also introduce CREST, a framework for unifying a collection of scattered datasets of causal relations.
Cornell University

Mainstreaming of conspiracy theories and misinformation

Parents - particularly moms - increasingly consult social media for support when taking decisions about their young children, and likely also when advising other family members such as elderly relatives. Minimizing malignant online influences is therefore crucial to securing their assent for policies ranging from vaccinations, masks and social distancing against the pandemic, to household best practices against climate change, to acceptance of future 5G towers nearby.
social media + society

Where Have All the Data Gone? A Critical Reflection on Academic Digital Research in the Post-API Age

In the wake of the 2018 Facebook–Cambridge Analytica scandal, social media companies began restricting academic researchers’ access to the easiest, most reliable means of systematic data collection via their application programming interfaces (APIs). Although these restrictions have been decried widely by digital researchers, in this essay, Rebekah Tromble argues that relatively little has changed. The underlying relationship between researchers, the platforms, and digital data remains largely the same.
Europe PubMed Central

Debunking the Misinfodemic: Coronavirus Social Media Contains More, Not Less, Credible Content

Several high-profile sources have focused worldwide attention on the dangers of misinformation about COVID, with the World Health Organization declaring a COVID-19 social media "infodemic". Prior work has associated such misinformation with low-credibility sources that are known to spread conspiracy theories and malicious content. Here, researchers report the results of an analysis of over 500 million social media posts from Twitter and Facebook between March 8 and May 1, 2020.
Cornell University

A Graph Attention Based Approach for Trajectory Prediction in Multi-agent Sports Games

This work investigates the problem of multi-agent trajectory prediction. Prior approaches lack the capability to capture fine-grained dependencies among coordinated agents. In this paper, researchers propose a spatial-temporal trajectory prediction approach that is able to learn the strategy of a team with multiple coordinated agents.

BugGraph: Differentiating Source-Binary Code Similarity with Graph Triplet-Loss Network

Binary code similarity detection, which answers whether two pieces of binary code are similar, has been used in a number of applications, such as vulnerability detection and automatic patching. Existing approaches face two hurdles in their efforts to achieve high accuracy and coverage.

SWARMGRAPH: Analyzing Large-Scale In-Memory Graphs on GPUs

Graph computation has attracted a significant amount of attention since many real-world data come in the form of graphs. Conventional graph platforms are designed to accommodate a single large graph at a time, while the simultaneous processing of a large number of in-memory graphs, each small enough to fit into device memory, is ignored.
Consumer Citizen

The Consumer Citizen

The Consumer Citizen is a book authored by Ethan Porter.

Semantics derived automatically from language corpora contain human-like biases

Machine learning is a means to derive artificial intelligence by discovering patterns in existing data. Here researchers show that applying machine learning to ordinary human language results in human-like semantic biases. We replicate a spectrum of known biases, as measured by the Implicit Association Test, using a widely used, purely statistical machine-learning model trained on a standard corpus of text from the Web. The results indicate that text corpora contain recoverable and accurate imprints of our historic biases, whether morally neutral as towards insects or flowers, problematic as towards race or gender, or even simply veridical, reflecting the status quo distribution of gender with respect to careers or first names. The methods hold promise for identifying and addressing sources of bias in culture, including technology.

Does Elite Rhetoric Undermine Democratic Norms?

Democratic stability depends on citizens on the losing side accepting election outcomes. Can rhetoric by political leaders undermine this norm? Using a panel survey experiment, we evaluate the effects of exposure to multiple statements from President Trump attacking the legitimacy of the 2020 U.S. presidential election. Though exposure to these statements does not measurably affect support for political violence or belief in democracy, it erodes trust and confidence in elections and increases belief that the election is rigged among people who approve of Trump’s job performance. These results suggest that rhetoric from political elites can undermine respect for critical democratic norms among their supporters.
Harvard Kennedy School Misinformation Review

State media warning labels can counteract the effects of foreign misinformation

Platforms are increasingly using transparency, whether it be in the form of political advertising disclosures or a record of page name changes, to combat disinformation campaigns. In the case of state-controlled media outlets on YouTube, Facebook, and Twitter, this has taken the form of labeling their connection to a state. We show that these labels have the ability to mitigate the effects of viewing election misinformation from the Russian media channel RT. However, this is only the case when the platform prominently places the label so as not to be missed by users.
Cornell University

A Multi-Modal Method for Satire Detection using Textual and Visual Cues

Satire is a form of humorous critique, but it is sometimes misinterpreted by readers as legitimate news, which can lead to harmful consequences. We observe that the images used in satirical news articles often contain absurd or ridiculous content and that image manipulation is used to create fictional scenarios. While previous work has studied text-based methods, in this work we propose a multi-modal approach based on the state-of-the-art visiolinguistic model ViLBERT. To this end, researchers created a new dataset consisting of images and headlines of regular and satirical news for the task of satire detection. The researchers also fine-tune ViLBERT on the dataset and train a convolutional neural network that uses an image forensics technique. Evaluation on the dataset shows that our proposed multi-modal approach outperforms image-only, text-only, and simple fusion baselines.

Validating Social Media Monitoring: Statistical Pitfalls and Opportunities from Public Opinion

Social media are a promising new data source for real-world behavioral monitoring. Despite clear advantages, analyses of social media data face some challenges. In this paper, we seek to elucidate some of these challenges and draw relevant lessons from more traditional survey techniques. Beyond standard machine learning approaches, we make the case that studies that conduct statistical analyses of social media data should carefully consider elements of study design, providing behavioral examples throughout. Specifically, we focus on issues surrounding the validity of statistical conclusions that may be drawn from social media data. We discuss common pitfalls and techniques to avoid these pitfalls, so researchers may mitigate potential problems of design.
Problems of Post Communism

Pandemic Politics in Eurasia: Roadmap for a New Research Subfield

The sudden onset of COVID-19 has challenged many social scientists to proceed without a robust theoretical and empirical foundation upon which to build. Addressing this challenge, particularly as it pertains to Eurasia, our multinational group of scholars draws on past and ongoing research to suggest a roadmap for a new pandemic politics research subfield. Key research questions include not only how states are responding to the new coronavirus, but also reciprocal interactions between the pandemic and society, political economy, regime type, center-periphery relations, and international security. The Foucauldian concept of “biopolitics” holds out particular promise as a theoretical framework.

The Disinformation Age

The Disinformation Age is a book co-authored by Steven Livingston.

Facebook Pages, the “Disneyland” Measles Outbreak, and Promotion of Vaccine Refusal as a Civil Right, 2009–2019

The dynamics of health misinformation on Facebook pose a threat to vaccination programs. Social media exposure is theorized to amplify vaccine skepticism, exposing billions of users to misinformation about vaccines, increasing hesitancy and delay, eroding trust in health care providers and public health experts, and reducing vaccination rates, with repeated exposures potentially exacerbating this hesitancy.

Adapting and Extending a Typology to Identify Vaccine Misinformation on Twitter

The rise in vaccine hesitancy—the delay and refusal of vaccines despite the availability of vaccination services—may be fueled, in part, by online claims that vaccines are ineffective, unnecessary, and dangerous. While opposition to vaccines is not new, these arguments have been reborn via new technologies that enable the spread of false claims with unprecedented ease, speed, and reach.
Cornell University

Not sure? Handling hesitancy of COVID-19 vaccines

From the moment the first COVID-19 vaccines are rolled out, there will need to be a large fraction of the global population ready in line. It is therefore crucial to start managing the growing global hesitancy to any such COVID-19 vaccine. The current approach of trying to convince the "no"s cannot work quickly enough, nor can the current policy of trying to find, remove and/or rebut all the individual pieces of COVID and vaccine misinformation. Instead, we show how this can be done in a simpler way by moving away from chasing misinformation content and focusing instead on managing the "yes--no--not-sure" hesitancy ecosystem.
Harvard Kennedy School Misinformation Review

Not just conspiracy theories: Vaccine opponents and proponents add to the COVID-19 ‘infodemic’ on Twitter

In February 2020, the World Health Organization announced an ‘infodemic’ — a deluge of both accurate and inaccurate health information — that accompanied the global pandemic of COVID-19 as a major challenge to effective health communication.
ScienceDirect

Unifying casualty distributions within and across conflicts

The distribution of whole war sizes and the distribution of event sizes within individual wars can both be well approximated by power laws where size is measured by the number of fatalities. However, the power-law exponent value for whole wars has a substantially smaller magnitude – and hence a flatter distribution – than for individual wars.
Cornell University

Covid-19 infodemic reveals new tipping point epidemiology and a revised R formula

Many governments have managed to control their COVID-19 outbreak with a simple message: keep the effective 'R number' R<1 to prevent widespread contagion and flatten the curve. This raises the question of whether a similar policy could control dangerous online 'infodemics' of information, misinformation and disinformation.
Cornell University

Hidden order in online extremism and its disruption by nudging collective chemistry

Here researchers show that the eclectic "Boogaloo" extremist movement that is now rising to prominence in the U.S. has a hidden online mathematical order that is identical to ISIS during its early development, despite their stark ideological, geographical and cultural differences. The evolution of each across scales follows a single shockwave equation that accounts for individual heterogeneity in online interactions. This equation predicts how to disrupt the onset and 'flatten the curve' of such online extremism by nudging its collective chemistry.
Cornell University

The COVID-19 Social Media Infodemic Reflects Uncertainty and State-Sponsored Propaganda

Significant attention has been devoted to determining the credibility of online misinformation about the COVID-19 pandemic on social media. Here, researchers compare the credibility of tweets about COVID-19 to datasets pertaining to other health issues.
Cornell University

Iterative Effect-Size Bias in Ridehailing: Measuring Social Bias in Dynamic Pricing of 100 Million Rides

Algorithmic bias is the systematic preferential or discriminatory treatment of a group of people by an artificial intelligence system. In this work, researchers developed a random-effects based metric for the analysis of social bias in supervised machine learning prediction models where model outputs depend on U.S. locations. They defined a methodology for using U.S. Census data to measure social bias on user attributes legally protected against discrimination, such as ethnicity, sex, and religion, also known as protected attributes.
Cornell University

Content analysis of Persian/Farsi Tweets during COVID-19 pandemic in Iran using NLP

Iran, along with China, South Korea, and Italy, was among the countries that were hit hard in the first wave of the COVID-19 spread. Twitter is one of the online platforms widely used by Iranians inside the country and abroad for sharing their opinions, thoughts, and feelings about a wide range of issues. In this study, using more than 530,000 original tweets in Persian/Farsi on COVID-19, we analyzed the topics discussed among users, who are mainly Iranians, to gauge and track the response to the pandemic and how it evolved over time.
IEEE Xplore

Quantifying COVID-19 content in the online health opinion war using machine learning

A huge amount of potentially dangerous COVID-19 misinformation is appearing online. Here we use machine learning to quantify COVID-19 content among online opponents of establishment health guidance, in particular vaccinations ("anti-vax"). We find that the anti-vax community is developing a less focused debate around COVID-19 than its counterpart, the pro-vaccination (“pro-vax”) community. However, the anti-vax community exhibits a broader range of “flavors” of COVID-19 topics, and hence can appeal to a broader cross-section of individuals seeking COVID-19 guidance online, e.g. individuals wary of a mandatory fast-tracked COVID-19 vaccine or those seeking alternative remedies.
Nature

The online competition between pro- and anti-vaccination views

Distrust in scientific expertise is dangerous. Opposition to vaccination with a future vaccine against SARS-CoV-2, the causal agent of COVID-19, for example, could amplify outbreaks as happened for measles in 2019. Homemade remedies and falsehoods are being shared widely on the Internet, as well as dismissals of expert advice. There is a lack of understanding about how this distrust evolves at the system level. Here we provide a map of the contention surrounding vaccines that has emerged from the global pool of around three billion Facebook users. Its core reveals a multi-sided landscape of unprecedented intricacy that involves nearly 100 million individuals partitioned into highly dynamic, interconnected clusters across cities, countries, continents and languages. Although smaller in overall size, anti-vaccination clusters manage to become highly entangled with undecided clusters in the main online network, whereas pro-vaccination clusters are more peripheral.
OSF Preprints

Misinformation on the Facebook News Feed: Experimental Evidence

As concerns about the spread of misinformation have mounted, scholars have found that fact-checks can reduce the extent to which people believe misinformation. Whether this finding extends to social media is unclear. Social media is a high-choice environment in which the cognitive effort required to separate truth from fiction, individuals' penchant for selective exposure, and motivated reasoning may render fact-checks ineffective. Furthermore, large social media companies have not permitted external researchers to administer experiments on their platforms. To investigate whether fact-checking can rebut misinformation on social media, we administer two experiments using a novel platform designed to closely mimic Facebook's news feed.
Cornell University

Detecting East Asian Prejudice on Social Media

The outbreak of COVID-19 has transformed societies across the world as governments tackle the health, economic and social costs of the pandemic. It has also raised concerns about the spread of hateful language and prejudice online, especially hostility directed against East Asia. In this paper we report on the creation of a classifier that detects and categorizes social media posts from Twitter into four classes: Hostility against East Asia, Criticism of East Asia, Meta-discussions of East Asian prejudice and a neutral class. The classifier achieves an F1 score of 0.83 across all four classes. We provide our final model (coded in Python), as well as a new 20,000 tweet training dataset used to make the classifier, two analyses of hashtags associated with East Asian prejudice and the annotation codebook. The classifier can be implemented by other researchers, assisting with both online content moderation processes and further research into the dynamics, prevalence and impact of East Asian prejudice online during this global pandemic.
Cornell University

Hate multiverse spreads malicious COVID-19 content online beyond individual platform control

We show that malicious COVID-19 content, including hate speech, disinformation, and misinformation, exploits the multiverse of online hate to spread quickly beyond the control of any individual social media platform. Machine learning topic analysis shows quantitatively how online hate communities are weaponizing COVID-19, with topics evolving rapidly and content becoming increasingly coherent. Our mathematical analysis provides a generalized form of the public health R0 predicting the tipping point for multiverse-wide viral spreading, which suggests new policy options to mitigate the global spread of malicious COVID-19 content without relying on future coordination between all online platforms.
Cornell University

Pro-Russian Biases in Anti-Chinese Tweets about the Novel Coronavirus

The recent COVID-19 pandemic, which was first detected in Wuhan, China, has been linked to increased anti-Chinese sentiment in the United States. Recently, Broniatowski et al. found that foreign powers, and especially Russia, were implicated in information operations using public health crises to promote discord, including racial conflict, in American society (Broniatowski et al., 2018).
Cornell University

The Twitter Social Mobility Index: Measuring Social Distancing Practices from Geolocated Tweets

Social distancing is an important component of the response to the novel Coronavirus (COVID-19) pandemic. Minimizing social interactions and travel reduces the rate at which the infection spreads, and "flattens the curve" such that the medical system can better treat infected individuals. However, it remains unclear how the public will respond to these policies. This paper presents the Twitter Social Mobility Index, a measure of social distancing and travel derived from Twitter data. We use public geolocated Twitter data to measure how much a user travels in a given week. We find a large reduction in travel in the United States after the implementation of social distancing policies, with larger reductions in states that were early adopters and smaller changes in states without policies. Our findings are presented online, and we will continue to update our analysis during the pandemic.
ScienceDirect

Chinese social media suggest decreased vaccine acceptance in China: An observational study on Weibo following the 2018 Changchun Changsheng vaccine incident

China is home to the world’s largest population, with the potential for disease outbreaks to affect billions. However, knowledge of Chinese vaccine acceptance trends is limited. In this work we use Chinese social media to track responses to the recent Changchun Changsheng Biotechnology vaccine scandal, which led to extensive discussion regarding vaccine safety and regulation in China. We analyzed messages from the popular Chinese microblogging platform Sina Weibo in July 2018 (n = 11,085) and August 2019 (n = 500). Thus, we consider Chinese vaccine acceptance before, during, immediately after, and one year after the scandal occurred. Results show that expressions of distrust in government pertaining to vaccines increased significantly during and immediately after the scandal. Self-reports of vaccination occurred both before and one year after the scandal; however, these self-reports changed from positive endorsements of vaccination to concerns about vaccine harms. Data suggest that expressed support for vaccine acceptance in China may be decreasing.
Cornell University

Machines Learn Appearance Bias in Face Recognition

We seek to determine whether state-of-the-art, black box face recognition techniques can learn first-impression appearance bias from human annotations. With FaceNet, a popular face recognition architecture, we train a transfer learning model on human subjects' first impressions of personality traits in other faces. We measure the extent to which this appearance bias is embedded and benchmark learning performance for six different perceived traits. In particular, we find that our model is better at judging a person's dominance based on their face than other traits like trustworthiness or likeability, even for emotionally neutral faces. We also find that our model tends to predict emotions for deliberately manipulated faces with higher accuracy than for randomly generated faces, just like a human subject. Our results lend insight into the manner in which appearance biases may be propagated by standard face recognition models.
ScienceDirect

A computational science approach to understanding human conflict

These findings show that a unified computational science framework can be used to understand and quantitatively describe collective human conflict.

Surfacing the Submerged State: Operational Transparency Increases Trust in and Engagement with Government

As trust in government reaches historic lows, frustration with government performance approaches record highs. Academic/practical relevance: We propose that in co-productive settings like government services, people’s trust and engagement levels can be enhanced by designing service interactions to allow them to see the often-hidden work – via increasing operational transparency – being performed in response to their engagement. Methodology and results: Three studies, conducted in the field and lab, show that surfacing the “submerged state” through operational transparency impacts citizens’ attitudes and behavior.
ScienceDirect

Vaccine-related advertising in the Facebook Ad Archive

In 2018, Facebook introduced Ad Archive as a platform to improve transparency in advertisements related to politics and “issues of national importance.” Vaccine-related Facebook advertising is publicly available for the first time. After measles outbreaks in the US brought renewed attention to the possible role of Facebook advertising in the spread of vaccine-related misinformation, Facebook announced steps to limit vaccine-related misinformation. This study serves as a baseline of advertising before new policies went into effect.
ACM Digital Library logo

GraphOne: A Data Store for Real-time Analytics on Evolving Graphs

There is a growing need to perform a diverse set of real-time analytics (batch and stream analytics) on evolving graphs to deliver the values of big data to users. The key requirement from such applications is to have a data store to support their diverse data access efficiently, while concurrently ingesting fine-grained updates at a high velocity. Unfortunately, current graph systems, either graph databases or analytics engines, are not designed to achieve high performance for both operations; rather, they excel in one area that keeps a private data store in a specialized way to favor their operations only. To address this challenge, we have designed and developed GraphOne, a graph data store that abstracts the graph data store away from the specialized systems to solve the fundamental research problems associated with the data store design.
ACL Anthology logo

Identifying Nuances in Fake News vs. Satire: Using Semantic and Linguistic Cues

The blurry line between nefarious fake news and protected-speech satire has been a notorious struggle for social media platforms. Further to the efforts of reducing exposure to misinformation on social media, purveyors of fake news have begun to masquerade as satire sites to avoid being demoted. In this work, we address the challenge of automatically classifying fake news versus satire. Previous work has studied whether fake news and satire can be distinguished based on language differences. Contrary to fake news, satire stories are usually humorous and carry some political or social message. We hypothesize that these nuances could be identified using semantic and linguistic cues. Consequently, we train a machine learning method using semantic representation, with a state-of-the-art contextual language model, and with linguistic features based on textual coherence metrics.
Journal of Peace Research logo

Crimea come what may: Do economic sanctions backfire politically?

Do international economic sanctions backfire politically, resulting in increased rather than decreased domestic support for targeted state leaders? Backfire arguments are common, but researchers have only recently begun systematically studying sanctions’ impact on target-state public opinion, not yet fully unpacking different possible backfire mechanisms. We formulate backfire logic explicitly, distinguishing between ‘scapegoating’ and ‘rallying’ mechanisms and considering the special case of ‘smart sanctions’ aimed at crony elites rather than the masses. We test five resulting hypotheses using an experimental design and pooled survey data spanning the imposition of sanctions in one of the most substantively important cases where the backfire argument has been prominent: Western sanctions on Russia in 2014. We find no evidence of broad sanctions backfire. Instead, sanctions have forced Russia’s president to pay a political price.
Cornell University

Health Wars and Beyond: The Rapidly Expanding and Efficient Network Insurgency Interlinking Local and Global Online Crowds of Distrust

We present preliminary results on the online war surrounding distrust of expertise in medical science -- specifically, the issue of vaccinations. While distrust and misinformation in politics can damage democratic elections, in the medical context it may also endanger lives through missed vaccinations and DIY cancer cures. We find that this online health war has evolved into a highly efficient network insurgency with direct inter-crowd links across countries, continents and cultures. The online anti-vax crowds (referred to as Red) now appear better positioned to groom new recruits (Green) than those supporting established expertise (Blue). We also present preliminary results from a mathematically-grounded, crowd-based analysis of the war's evolution, which offers an explanation for how Red seems to be turning the tide on Blue.
False Alarm cover

False Alarm: The Truth about Political Mistruths in the Trump Era

False Alarm is a book authored by Ethan Porter.
JAMA The Journal of the American Medical Association

Government Role in Regulating Vaccine Misinformation on Social Media Platforms

Freedom of speech is one of the most fundamental rights in the United States. The challenges of balancing free speech against harms caused by misinformation on social media are well illustrated by antivaccine activists, who claim, against the evidence, that vaccines cause death or other harmful adverse effects. These activists use social media platforms, such as Facebook and Twitter, to share misleading information supporting their views on vaccines. In fact, half of all parents with children younger than 5 years have been exposed to misinformation about vaccines on social media [1]. A 2019 study [1] even found that neutral searches of the word vaccine by a new user with no friends or likes yielded overwhelmingly antivaccine content unsupported by science both on Facebook and YouTube.
First Monday logo

Elites and foreign actors among the alt-right: The Gab social media platform

Content regulation and censorship of social media platforms is increasingly discussed by governments and the platforms themselves. To date, there has been little data-driven analysis of the effects of regulated content deemed inappropriate on online user behavior. We therefore compared Twitter — a popular social media platform that occasionally removes content in violation of its Terms of Service — to Gab — a platform that markets itself as completely unregulated.
IEEE Xplore logo

Securing Malware Cognitive Systems against Adversarial Attacks

Cognitive systems, along with machine learning techniques, have provided significant improvements for many applications. However, recent adversarial attacks, such as data poisoning, evasion attacks, and exploratory attacks, have been shown to either cause machine learning methods to misbehave or leak sensitive model parameters. In this work, we have devised a prototype of a malware cognitive system, called DeepArmour, which performs robust malware classification against adversarial attacks. At the heart of our method is a voting system with three different machine learning malware classifiers: random forest, multi-layer perceptron, and structure2vec. In addition, DeepArmour applies several adversarial countermeasures, such as feature reconstruction and adversarial retraining, to strengthen its robustness.
Post Soviet Affairs logo

A surprising connection between civilizational identity and succession expectations among Russian elites

We know from prior research that non-democratic regimes can become vulnerable when elites anticipate succession at the top, but we know little about what shapes these elites’ expectations. This study examines connections between such expectations and Russia’s relationships to the outside world. Analysis of elite opinion data from the 2016 Survey of Russian Elites reveals strong associations between identifying Russia with European civilization and expecting Russian politics to display behaviors more like those believed to characterize European polities, including more frequent dominant party turnover. Elites appear not to expect their top political leadership to pay a political price for what they perceive as foreign policy blunders in a consistent way, though opposition elites critical of Russia’s actions in Ukraine are found to expect an earlier United Russia Party exit. Variations in threat perceptions are not found to influence predictions of leadership tenure.
Springer logo

To illuminate and motivate: a fuzzy-trace model of the spread of information online

We propose, and test, a model of online media platform users’ decisions to act on, and share, received information. Specifically, we focus on how mental representations of message content drive its spread. Our model is based on fuzzy-trace theory (FTT), a leading theory of decision under risk. Per FTT, online content is mentally represented in two ways: verbatim (objective, but decontextualized, facts), and gist (subjective, but meaningful, interpretation). Although encoded in parallel, gist tends to drive behaviors more strongly than verbatim representations for most individuals. Our model uses factors derived from FTT to make predictions regarding which content is more likely to be shared, namely: (a) different levels of mental representation, (b) the motivational content of a message, (c) difficulty of information processing (e.g., the ease with which a given message may be comprehended and, therefore, its gist extracted), and (d) social values.
Springer logo

Helping the Homeless: The Role of Empathy, Race and Deservingness in Motivating Policy Support and Charitable Giving

What will motivate citizens to support efforts to help those in need? Charitable organizations seeking support for their cause will often use the story of a specific individual to illustrate the problem and generate support. We explore the effectiveness of this strategy using the issue of homelessness. Specifically, we examine the role that the race of beneficiaries featured in a message, and the inclusion of deservingness cues highlighting external attributions for an individual’s homelessness, have on willingness to donate to the homeless and support government efforts to address homelessness. Utilizing two experiments with a nationally representative probability sample and an online opt-in quota sample, we find significant effects of deservingness information on expressions of sympathy, and on support for government efforts to address homelessness when viewing individuals from one’s own racial group.
Research and Politics logo

Can presidential misinformation on climate change be corrected? Evidence from Internet and phone experiments

Can presidential misinformation affect political knowledge and policy views of the mass public, even when that misinformation is followed by a fact-check? We present results from two experiments, conducted online and over the telephone, in which respondents were presented with Trump misstatements on climate change. While Trump’s misstatements on their own reduced factual accuracy, corrections prompted the average subject to become more accurate. Republicans were not as affected by a correction as their Democratic counterparts, but their factual beliefs about climate change were never more affected by Trump than by the facts. In neither experiment did corrections affect policy preferences. Debunking treatments can improve factual accuracy even among co-partisans subjected to presidential misinformation. Yet an increase in climate-related factual accuracy does not sway climate-related attitudes. Fact-checks can limit the effects of presidential misinformation, but have no impact on the president’s capacity to shape policy preferences.
ACL Anthology logo

Challenges and frontiers in abusive content detection

Online abusive content detection is an inherently difficult task. It has received considerable attention from academia, particularly within the computational linguistics community, and performance appears to have improved as the field has matured. However, considerable challenges and unaddressed frontiers remain, spanning technical, social and ethical dimensions. These issues constrain the performance, efficiency and generalizability of abusive content detection systems. In this article we delineate and clarify the main challenges and frontiers in the field, critically evaluate their implications and discuss potential solutions. We also highlight ways in which social scientific insights can advance research. We discuss the lack of support given to researchers working with abusive content and provide guidelines for ethical research.

How Should We Now Conceptualize Protest, Diffusion, and Regime Change?

Brancati and Lucardi’s findings on the absence of “democracy protest” diffusion across borders raise important questions for the future of protest studies. I argue that this subfield would benefit from a stronger engagement with theory (in general) and from a “patronal politics” perspective (in particular) when it comes to researching protest in non-democratic regimes. This means curtailing a widespread practice of linking the study of protest with the study of democratization, questioning the dominant “contentious politics” framework as commonly conceptualized, and instead focusing more on the central role of patronal network coordination dynamics (especially elite splits) in driving both protest and the potential for regime change. This perspective emphasizes the role of domestically generated succession expectations and public opinion in generating the most meaningful elite splits, and reveals how protests can be important instruments in the resulting power struggles among rival networks.
Usenix logo

SIMD-X: Programming and Processing of Graph Algorithms on GPUs

With high computation power and memory bandwidth, graphics processing units (GPUs) lend themselves to accelerating data-intensive analytics, especially when such applications fit the single instruction multiple data (SIMD) model. However, graph algorithms, such as breadth-first search and k-core, often fail to take full advantage of GPUs due to irregularity in memory access and control flow. To address this challenge, we have developed SIMD-X, for programming and processing of single instruction multiple, complex, data on GPUs.

Git Blame Who?: Stylistic Authorship Attribution of Small, Incomplete Source Code Fragments

Program authorship attribution has implications for the privacy of programmers who wish to contribute code anonymously. While previous work has shown that individually authored complete files can be attributed, these efforts have focused on such ideal data sets as contest submissions and student assignments. We explore the problem of authorship attribution “in the wild,” examining source code obtained from open-source version control systems, and investigate how contributions can be attributed to their authors, either on an individual or a per-account basis.
Cornell University

How we do things with words: Analyzing text as social and cultural data

In this article we describe our experiences with computational text analysis. We hope to achieve three primary goals. First, we aim to shed light on thorny issues not always at the forefront of discussions about computational text analysis methods. Second, we hope to provide a set of best practices for working with thick social and cultural concepts. Our guidance is based on our own experiences and is therefore inherently imperfect. Still, given our diversity of disciplinary backgrounds and research practices, we hope to capture a range of ideas and identify commonalities that will resonate for many. And this leads to our final goal: to help promote interdisciplinary collaborations. Interdisciplinary insights and partnerships are essential for realizing the full potential of any computational text analysis that involves social and cultural concepts, and the more we are able to bridge these divides, the more fruitful we believe our work will be.

The (Mis)Informed Citizen: Indicators for Examining the Quality of Online News

Current discourses about the spread of misinformation tend to juxtapose misinformation with "quality" news — assuming that one is the opposite of the other and suggesting that, if we simply ensure people have better access to quality information, the harmful effects of misinformation will be mitigated.
American Public Health Association logo

Malicious Actors on Twitter: A Guide for Public Health Researchers

Social bots and other malicious actors have a significant presence on Twitter. It is increasingly clear that some of their activities can have a negative impact on public health. This guide provides an overview of the types of malicious actors currently active on Twitter by highlighting the characteristic behaviors and strategies employed. It covers both automated accounts (including traditional spambots, social spambots, content polluters, and fake followers) and human users (primarily trolls). It also addresses the unique threat of state-sponsored trolls. We utilize examples from our own research on vaccination to illustrate.
Journal of Medical Internet Research logo

Characterizing Trends in Human Papillomavirus Vaccine Discourse on Reddit (2007-2015): An Observational Study

Despite the introduction of the human papillomavirus (HPV) vaccination as a preventive measure in 2006 for cervical and other cancers, uptake rates remain suboptimal, resulting in preventable cancer mortality. Social media, widely used for information seeking, can influence users’ knowledge and attitudes regarding HPV vaccination. Little is known regarding attitudes related to HPV vaccination on Reddit (a popular news aggregation site and online community), particularly related to cancer risk and sexual activity. Examining HPV vaccine–related messages on Reddit may provide insight into how HPV discussions are characterized on forums online and influence decision making related to vaccination.
Policy Insights for the Behavioral and Brain Sciences logo

Communicating Meaning in the Intelligence Enterprise

Intelligence community experts face challenges communicating the results of analysis products to policy makers. Given the high-stakes nature of intelligence analyses, the consequences of misinformation may be dire, potentially leading to costly, ill-informed policies or lasting damage to national security. Much is known regarding how to effectively communicate complex analysis products to policy makers possessing different sources of expertise. Fuzzy-Trace Theory, an empirically-validated psychological account of how decision makers derive meaning from complex stimuli, emphasizes the importance of communicating the essential bottom-line of an analysis (“gist”), in parallel with precise details (“verbatim”). Verbatim details can be prone to misinterpretation when presented out of context.
BMJ Open logo

Can online self-reports assist in real-time identification of influenza vaccination uptake? A cross-sectional study of influenza vaccine-related tweets in the USA, 2013–2017

The Centers for Disease Control and Prevention (CDC) spend significant time and resources to track influenza vaccination coverage each influenza season using national surveys. Emerging data from social media provide an alternative solution to surveillance at both national and local levels of influenza vaccination coverage in near real time.