Automatic extraction and classification of causal relations in text has long been an important yet challenging task in natural language processing and understanding. Early methods from the 1980s and 1990s (Joskowicz et al., 1989; Kaplan and Berry-Rogghe, 1991; Garcia et al., 1997; Khoo et al., 1998) relied mainly on hand-crafted rules to find cause-effect relations. Starting in the 2000s, machine learning tools were used to build causal relation extraction models (Girju, 2003; Chang and Choi, 2004, 2006; Blanco et al., 2008; Do et al., 2011; Hashimoto et al., 2012; Hidey and McKeown, 2016). In recent years, word embeddings and pretrained language models have also been leveraged to train models for understanding causality in language (Dunietz et al., 2018; Pennington et al., 2014; Dasgupta et al., 2018; Gao et al., 2019).
The true capability of pretrained language models in understanding causality in text remains an open question. More recently, Knowledge Graphs (KGs) have been used in combination with pretrained language models to address commonsense reasoning. Two examples are CausalBERT (Li et al., 2020), for guided generation of cause and effect, and the model introduced by Guan et al. (2020) for commonsense story generation.