Towards Parsimonious Sociology Theory Construction with Neural Embeddings and Semantic Analysis  
Author Mingzhe Du


Co-Author(s) Zaid Alibadi; Jose M. Vidal; Barry Markovsky


Abstract In the social sciences, theory construction is the process of formulating and expanding the components of theories through logical, semantic, and empirical analysis. The conceptual ideas of theories and their meanings are conveyed through definitions. An important standard for evaluating the quality of a theory is the principle of parsimony, which mandates minimizing the number of definitions used in that theory. Conventional methods for parsimony analysis in theory construction are mainly heuristic and tend to produce results that lack coherence and logical integrity. In this study, we propose an embedding-based approach that uses a machine learning model to reduce semantically similar sociological definitions, with definitions encoded as word embeddings and sentence embeddings. Given that several types of embeddings exist, we compare the definitions' encodings to understand which embeddings are more suitable for parsimonious theory construction. We evaluate our approach on a sociologist-annotated dataset of 2235 definition pairs drawn from the sociological literature. Our experimental results show that the Transformer outperforms the other seven embedding methods when employed with supervised machine learning models. The proposed approach achieves the best accuracy of 84.82%, compared with Word2Vec (81.7%), GloVe (82.14%), ELMo (64.73%), fastText (79.91%), InferSent (75%), USE-DAN (83.48%), and BERT (55.8%).
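
As a rough illustration of the general idea of comparing definitions through embeddings, the following minimal sketch averages word vectors into a definition vector and flags a pair as redundant when the cosine similarity is high. It is not the authors' pipeline: the tiny vector table, the function names, and the fixed threshold are all assumptions made for illustration, and the threshold stands in for the supervised machine learning models the abstract describes.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A real system would load pretrained embeddings (e.g. Word2Vec or GloVe).
TOY_VECTORS = {
    "group": [0.9, 0.1, 0.0],
    "collective": [0.8, 0.2, 0.1],
    "people": [0.7, 0.3, 0.0],
    "actors": [0.6, 0.4, 0.1],
    "norm": [0.1, 0.9, 0.2],
    "rule": [0.2, 0.8, 0.3],
    "behavior": [0.1, 0.7, 0.6],
}


def embed(definition):
    """Average the vectors of known words; unknown words are skipped."""
    vecs = [TOY_VECTORS[w] for w in definition.lower().split() if w in TOY_VECTORS]
    if not vecs:
        raise ValueError("no known words in definition")
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def is_redundant(def_a, def_b, threshold=0.95):
    """Hypothetical decision rule: a fixed threshold replaces a trained classifier."""
    return cosine(embed(def_a), embed(def_b)) >= threshold


if __name__ == "__main__":
    print(is_redundant("a group of people", "a collective of actors"))
    print(is_redundant("a group of people", "a rule of behavior"))
```

In a setting like the one described, the similarity score (or the embeddings themselves) would more plausibly serve as input features to a classifier trained on the annotated definition pairs, rather than being cut at a hand-picked threshold.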


Keywords word embedding, sentence embedding, semantic similarity, theory construction, parsimony
Article #: DSIS19-91
Proceedings of ISSAT International Conference on Data Science & Intelligent Systems
August 1-3, 2019 - Las Vegas, NV, U.S.A.