New standards needed to transform databases into effective research tools

By Geng Xue | 06-21-2016
(Chinese Social Sciences Today)

As database construction shifts from paper-based literature collection to digitized archives, there is an urgent need to reform compilation standards.

 

In recent years, with the advancement of digital technology, various databases have been built to facilitate the collection and analysis of information. Official statistics show that more than 127 database projects supported by the National Social Science Fund of China (NSSFC) have been completed. These databases provide ample resources for research. However, database design poses a major problem for researchers. The main concern is no longer whether databases should be used, but how to design them well and use them appropriately.

 

Design
For historical researchers, embracing big data means more than just using databases as a platform for information retrieval. It is an approach to developing traditional materials. Xia Mingfang, a professor from the Institute of Qing History (IQH) at Renmin University of China with abundant experience in literature collation, is now undertaking a major project named “Annals and Database Building of Disasters During the Qing Dynasty” supported by the NSSFC. Xia contended that as the work moves from paper-based literature collection to modern database construction, a new platform is essential.


Database construction consists of much more than just collecting digitized literature. Before construction begins, a detailed design is necessary. Zhang Ping, a professor from the Northwest Institute of Historical Environment and Socioeconomic Development (NIHESD) at Shaanxi Normal University, argued that many questions should be considered in advance, such as “Why should this database be built?”, “What is the design of the database?”, “Who will this database serve?” and “What problems can be solved through the database?”


However, for many scholars, a high-quality database is hard to create. It is even harder to build a database that will be widely cited and represents significant progress in the relevant academic fields. Hu Hengbiao, a professor from the IQH, said that from a broad perspective, some databases lack long-term design and coordination. Redundant recording is widespread and there is no unified standard for data collection, which leads to a huge waste of data.

 

Compatibility
However, long-term database planning cannot be achieved in the short term. Finding ways to avoid building redundant databases and seeking greater compatibility between different databases are core concerns of many scholars.


Standardizing database-building principles makes long-term use of databases possible. Zhao Siyuan, a scholar from the Department of History at Shanghai Jiaotong University, argued that although different fields adopt various methods to construct databases, some common principles should be employed when setting up a shared data standard. Once individual databases reach a large scale, linking them may greatly facilitate research in certain fields.


Pan Wei, a research fellow from the NIHESD, said that standard criteria should not be limited to code standardization. Data collection, analysis and production need guiding principles that cover both data standards and workflow. Thorough work is essential when setting the standard criteria, Pan said.


To accommodate information of various kinds and allow for multiple uses, databases must be open to receiving feedback from users. This is the so-called “crowdsourcing” framework, which provides an efficient way to improve databases.


Database construction requires close cooperation between multiple disciplines. Zhao said that the concept of digital humanities has transformed the methods of database construction and development. Researchers in the humanities and social sciences are actively participating in database construction rather than passively accepting it, incorporating database development processes into their research.

 

Functions
Digital humanities is a new interdisciplinary concept that has emerged over the last two decades. The concept is widely applied in historical geography, history and especially economic history, bringing new ideas to these disciplines.


As digital humanities practices broaden, more analytical instruments have been applied to literature collection and interpretation. Zhao said historical literature databases are becoming an independent literary form and are much more than a warehouse of historical literature.


In addition to providing rich data resources and acting as a convenient search tool, databases promote interdisciplinary conversations. They broaden researchers' horizons and help avoid fragmentation of research. Awareness of problems is also a necessity in database use and construction. Xu Liheng, a research fellow from the China Biographical Database research group, argued that in this era, the relationship between database construction and awareness of problems will become closer.

 

Geng Xue is a reporter at Chinese Social Sciences Today.