Big data gathered from social media can be used to analyze people’s moods, social behavior, and cultural concepts. Photo: CFP
Over the course of the development of computational social science, many different forms of big data have emerged, such as Google Books and Wikipedia. Among them, big data on public opinion gathered from Twitter, Facebook, and online news outlets has become an important object of study. This article offers a concise analysis of public opinion big data and its application in the social sciences.
Discourse is the first attribute to analyze. Digital text expresses a variety of opinions, attitudes, and standpoints, all of which qualify as discourse: different social subjects express different views based on their positions. Discourse analysis should therefore be the first dimension of any big data analysis of public opinion. Within this framework, confrontations and power struggles between different discourse subjects can be presented, along with a thorough mapping of their social influence.
The next feature is emotional attributes. Public opinion is filled with emotional expression. On social media, people usually express their thoughts, living conditions, and lifestyles directly. These self-expressions tend to come straight from the heart, sharing joy or sadness, shock or wrath.
In recent years, affective computing, also known as emotion-detecting AI technology, has been rapidly emerging in the field of natural language processing. At first, texts were processed with Linguistic Inquiry and Word Count (LIWC), WordNet, or other tools to count the percentage of words that reflect emotions. Today, machine learning and Bidirectional Encoder Representations from Transformers (BERT) models are used to conduct more elaborate analyses of emotions, and several emotion analysis technologies are developing at a rapid pace. In terms of content, affective computing could at first only match words to basic categories (positive and negative emotions), but now other emotions such as happiness, anger, sadness, love, and scarcity can also be computed. As the technology develops further, more sophisticated emotions such as admiration, jealousy, and hatred may become computable, which extends the possibilities for research. As the Chinese-American AI expert Li Feifei has said, after visual computing, the next priority for AI is affective computing.
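The early dictionary-based approach mentioned above can be illustrated with a minimal sketch. It assumes a tiny, hypothetical word list (`TOY_LEXICON`) and a simple regular-expression tokenizer; real dictionaries such as LIWC are far larger and are not reproduced here, and the function name `emotion_percentages` is chosen purely for illustration.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration only; it is not the LIWC
# dictionary, which is much larger and organized differently.
TOY_LEXICON = {
    "positive": {"happy", "joy", "love", "glad"},
    "negative": {"sad", "angry", "fear", "hate"},
}


def emotion_percentages(text, lexicon=TOY_LEXICON):
    """Return, for each category, the percentage of tokens found in it."""
    # Lowercase the text and keep runs of letters/apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {category: 0.0 for category in lexicon}

    counts = Counter()
    for token in tokens:
        for category, words in lexicon.items():
            if token in words:
                counts[category] += 1

    # Percentage of all tokens in the text that fall in each category.
    return {
        category: 100.0 * counts[category] / len(tokens)
        for category in lexicon
    }


if __name__ == "__main__":
    sample = "I am so happy today but my friend is sad"
    print(emotion_percentages(sample))
```

A sketch like this can only match words in fixed categories, which is why the later model-based methods described above are needed for subtler emotions.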
The next feature concerns attributes of communication. Social media platforms host rich communication phenomena, which involve not only the spread of discourse but also the dissemination of emotions. For example, feelings of panic spread extensively during the COVID-19 pandemic. A glance at these varied communication phenomena shows that most information vanishes amid the vast ocean of content, while some elements carry stronger vitality and succeed in going viral.
The key issues here are which factors determine the effects of communication and which factors shape the communication landscape. Take the political doctrine of populism as an example. Why did this kind of discourse sweep across cyberspace, moving around the globe within a very short period of time, to form major schools of social thought that rewrote history? Other questions follow: What forces manipulate the communication of information in cyberspace? What roles do social capital, the government, and social organizations play in the communication process?
Next, we must examine big data from a sociological perspective. Big data on public opinion provides a snapshot of the lifestyles and living conditions of different social classes and groups. On this basis, methods from the traditional social sciences, such as class analysis and comparative group analysis, can be used to determine the political and social attitudes of different groups and the interactions between different classes. In addition, patterns of state-society relations can be summarized to infer the logic that underpins social functioning and social structures.
The next important feature is the global nature of big data. The internet has no borders, and public opinion information has increasingly become interconnected across the globe. An increasing number of social media platforms have opened access to many countries around the world, so an event that inspires widespread discussion may exert an influence on distant countries.
In this context, big data on public opinion carries more and more global attributes, which makes it convenient for social scientists to conduct comparative studies of multiple countries using these globally open databases. For example, researchers Scott Golder and Michael Macy used Twitter posts to analyze individuals’ moods on continents around the world and identified consistent variations in people’s emotions: people across the globe display similar rhythms in their moods, depending on the time of day or night. Similarly, it is possible to study the values and cultural concepts of people from different corners of the earth. Integrating a global vision into empirical analysis may broaden the scope of future research in the social sciences.
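The time-of-day comparison described above can be sketched in a few lines. This is a minimal illustration, not the procedure Golder and Macy actually used: it assumes each post already carries a local ISO-format timestamp and a numeric mood score from some upstream emotion analysis, and the function name `mean_mood_by_hour` and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def mean_mood_by_hour(posts):
    """Average mood score per local hour of day.

    posts: iterable of (iso_timestamp, mood_score) pairs, where the
    timestamp is assumed to be in the poster's local time.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, score in posts:
        hour = datetime.fromisoformat(timestamp).hour
        totals[hour] += score
        counts[hour] += 1
    # Return hours in ascending order so daily rhythms are easy to read.
    return {hour: totals[hour] / counts[hour] for hour in sorted(counts)}


if __name__ == "__main__":
    # Made-up sample posts, for illustration only.
    sample_posts = [
        ("2020-01-01T08:15:00", 0.5),
        ("2020-01-02T08:45:00", 1.0),
        ("2020-01-01T23:05:00", -0.25),
    ]
    print(mean_mood_by_hour(sample_posts))
```

Running the same aggregation separately for each country or region would then allow the hourly profiles to be compared across populations.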
Gong Weigang is from the School of Sociology at Wuhan University.
Edited by BAI LE