
Big data analysis for Chinese political science

ZHENG LI | 2022-10-13
Chinese Social Sciences Today

The Global Digital Economy Conference in Beijing, July 28th Photo: CFP

Unlike purely cross-sectional or time-series data, panel data tracks multiple cross-sectional units over time. Because of this structure, panel data not only offers a more comprehensive dataset but also supports rigorous causal analysis. Big data analysis is of strategic importance to the scientific development and wider application of Chinese political science.

Data-driven empirical research

The world is undergoing the fourth industrial revolution. Applying information and intelligent technologies to major social issues has therefore become a defining task of the era for social science researchers. Political science, as one of the pillar disciplines, shoulders the same responsibility.

Since the latter half of the 20th century, under the deep influence of the “behavioral revolution,” the research focus of modern political science has shifted gradually from macro theory construction and descriptions of institutions to providing solid evidence for traditional theories and political practices through scientific methods. Today, the goal of data-driven empirical political science is to move beyond finding correlations between variables and to build political propositions and theories grounded in rigorous causal relations.

Causal revolution

Causal reasoning is the core of scientific research. Searching for robust causal relations, both theoretically and practically, is the central task of political science and of the social sciences more broadly. Causal reasoning moves a research hypothesis from correlation to causation.

To establish a causal relationship in a research hypothesis, three conditions should be met. First, the independent and dependent variables should be strongly correlated. Second, the cause must precede the effect in time. Third, an authentic and reliable causal mechanism must link the two events or variables.

Panel data for causal reasoning

Panel data is obtained by observing the same units repeatedly, which makes it possible to control for variables that are hard to observe and to assess lags in behaviors and outcomes. For example, panel data analysis can use changes in a country’s domestic policies over time to support causal reasoning, setting aside time-invariant differences among countries. This method is called a panel fixed effect model.
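The idea behind the fixed effect model can be illustrated with a minimal sketch. It assumes a toy panel of three hypothetical units ("A", "B", "C") observed over four periods, with invented numbers and a true slope of 2. Subtracting each unit's own mean from its observations removes any unit-level difference that does not change over time, so the slope is estimated only from variation within units.

```python
from collections import defaultdict

def ols_slope(xs, ys):
    """Slope of a simple pooled regression of y on x (with intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def within_slope(rows):
    """Fixed effect (within) estimator: demean x and y by unit, then regress."""
    groups = defaultdict(list)
    for unit, x, y in rows:
        groups[unit].append((x, y))
    dx, dy = [], []
    for obs in groups.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            dx.append(x - mx)
            dy.append(y - my)
    # Demeaned data have zero mean, so no intercept is needed.
    return sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)

# Hypothetical toy panel: y = unit_effect + 2 * x, where the unobserved
# unit effect is correlated with x (units with low x have a high effect).
unit_effect = {"A": 10.0, "B": 5.0, "C": 0.0}
x_start = {"A": 0.0, "B": 5.0, "C": 10.0}
rows = []
for unit in unit_effect:
    for t in range(4):
        x = x_start[unit] + t
        rows.append((unit, x, unit_effect[unit] + 2.0 * x))

pooled = ols_slope([r[1] for r in rows], [r[2] for r in rows])
fixed_effect = within_slope(rows)
print("pooled slope:", round(pooled, 3))
print("fixed effect slope:", round(fixed_effect, 3))
```

In this constructed example the pooled regression is biased toward the between-unit pattern, while the within estimator recovers the slope of 2. Real applications would normally rely on an established statistics package and report standard errors.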

Advanced methods for causal reasoning which use panel data include: the Difference-in-Difference Method and the Synthetic Control Method.

The Difference-in-Difference Method estimates a policy’s effect by comparing a treated unit and a control unit over a period of time, using two fixed effects (one for the unit and one for the time period). When the control unit provides a valid counterfactual, that is, when the two units would have followed parallel trends in the absence of the policy, the estimated value is the effect of the policy.

The Synthetic Control Method is an extension of the Difference-in-Difference Method. In social science research, finding a control group that satisfies the parallel trends assumption often proves difficult. By assigning weights to the other comparison units, the Synthetic Control Method generates a synthetic unit that resembles the treated group to facilitate causal reasoning. Panel data can also be used to address other issues in causal reasoning by incorporating an Instrumental Variable Method and a Lagged Dependent Variable Method.
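Both methods can be sketched on invented numbers. The example below is only an illustration under simplifying assumptions: the Difference-in-Difference part uses the basic two-group, two-period comparison of means, and the synthetic control part uses just two hypothetical donor units with a grid search over a single weight, whereas applied work uses many donors and constrained optimization. All data and function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated unit minus change in the control unit."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical outcomes before and after a policy.
did_effect = diff_in_diff(
    treated_pre=[20.0, 21.0], treated_post=[28.0, 29.0],
    control_pre=[10.0, 11.0], control_post=[13.0, 14.0],
)
print("difference-in-difference estimate:", did_effect)

def synthetic_weight(treated_pre, donor1_pre, donor2_pre, steps=100):
    """Find w in [0, 1] so that w*donor1 + (1-w)*donor2 best matches
    the treated unit's pre-policy outcomes (smallest squared gap)."""
    best_w, best_loss = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        loss = sum(
            (t - (w * d1 + (1 - w) * d2)) ** 2
            for t, d1, d2 in zip(treated_pre, donor1_pre, donor2_pre)
        )
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w

# Hypothetical pre-policy outcomes for the treated unit and two donors.
treated_pre = [12.5, 14.5, 16.5]
donor1_pre = [10.0, 12.0, 14.0]
donor2_pre = [20.0, 22.0, 24.0]
w = synthetic_weight(treated_pre, donor1_pre, donor2_pre)

# Post-policy: compare the treated unit with its weighted "synthetic" twin.
treated_post, donor1_post, donor2_post = 21.5, 16.0, 26.0
synthetic_post = w * donor1_post + (1 - w) * donor2_post
sc_effect = treated_post - synthetic_post
print("donor weight:", w, "synthetic control estimate:", sc_effect)
```

The Difference-in-Difference estimate is credible only if the two units would have moved in parallel without the policy. The synthetic control estimate depends instead on how closely the weighted donors reproduce the treated unit before the policy.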

Panel data analysis has greatly expanded the application of causal reasoning in the social sciences, including political science.

Paradigm shift

The arrival of the big data era has driven the research paradigm revolution in political science. Big data analysis has further promoted the “theoretical hypothesis-driven” political science research paradigm, providing reliable and solid evidence for political decision-making and policy enforcement.

When wedded to machine learning, cutting-edge big data research will play a key role in major political issue prediction.

Employing a standardized research design and scientific data analysis can clearly reveal the logic behind political phenomena and bring about innovation in political knowledge.

Big data analysis has enormous room for application in the scientific prevention of COVID-19, primary-level governance in rural areas, social security and social distribution, and the construction of the Belt and Road Initiative.

Zheng Li is an associate professor from the School of Government at Yunnan University.



Edited by ZHAO YUAN