Digital and intelligent technologies empower scientometric evaluation
Fostering scientometrics is conducive to advancing the academic assessment system. Photo: TUCHONG
Sci-tech evaluation encompasses project review, talent evaluation, institution assessment, disciplinary evaluation, journal evaluation, and achievement evaluation, among other types, and plays an important role in guiding academic innovation. Establishing a sound, science-based evaluation system is essential to China’s goal of technological self-reliance and the construction of an innovative nation in the new era. Scientometrics, an emerging discipline devoted to the quantitative study of science, is increasingly intertwined with sci-tech evaluation, and the two promote each other’s development. Scientometrics provides methodological support and theoretical grounding for sci-tech evaluation and management, while pressing challenges and needs for reform and innovation in sci-tech evaluation drive the continued evolution of the field.
Scientometrics is also closely bound up with qualitative evaluation. Currently, the rapid development of data-driven technologies such as big data, cloud computing, artificial intelligence, and blockchain offers new opportunities for both theoretical research and practical application in scientometrics and sci-tech evaluation. In response to emerging needs, applying data-driven technologies is essential for deeply integrating qualitative and quantitative methods, enhancing the complementarity between metrics and peer review, and advancing transformative models of sci-tech evaluation. Leveraging these digital and intelligent tools to drive reform and development in scientometrics and evaluation is both a mission of the times and a response to societal demands for improved evaluation practices.
Limitations of traditional evaluation
Currently, traditional quantitative evaluation, which relies primarily on bibliographic data and basic frequency statistics, has developed into a fairly comprehensive system with established theories, models, methods, indicators, data, and applications. However, despite its widespread use, it is often criticized for limited accuracy and representativeness.
Due to technological and cost constraints, traditional quantitative evaluations often rely on limited data sources and small sample sizes. On one hand, the primary data sources and subjects of quantitative evaluation are typically research outputs that can be easily quantified, together with their bibliographic metadata, which offer only coarse data granularity. On the other hand, mainstream evaluation indicators are largely citation-based, such as the impact factor, the Eigenfactor, and the h-index, along with their derivatives. This focus results in a narrow evaluation scope that lacks broader quantitative analysis of diverse text types and scientific big data. Furthermore, citation-based metrics exhibit a marked Matthew effect, whereby highly cited works continue to attract more citations, an inherent limitation of traditional citation-centered evaluation.
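To make the Matthew effect concrete, the toy simulation below allocates citations by preferential attachment, so already-cited papers attract further citations. It is a minimal sketch: the paper counts, mixing weight, and seed are arbitrary assumptions for illustration, not empirical parameters.

```python
import random

def simulate_matthew_effect(n_papers=100, n_citations=2000, pref_weight=0.9, seed=42):
    """Toy cumulative-advantage model: each incoming citation targets a paper
    in proportion to its current citation count with probability pref_weight,
    otherwise a paper chosen uniformly at random."""
    rng = random.Random(seed)
    counts = [1] * n_papers  # seed every paper with one citation
    for _ in range(n_citations):
        if rng.random() < pref_weight:
            # Preferential attachment: already-cited papers win more often.
            winner = rng.choices(range(n_papers), weights=counts)[0]
        else:
            winner = rng.randrange(n_papers)
        counts[winner] += 1
    return sorted(counts, reverse=True)

counts = simulate_matthew_effect()
print(f"Top 10% of papers capture {sum(counts[:10]) / sum(counts):.0%} of citations")
```

Even this crude model concentrates a large share of citations in a small fraction of papers, mirroring the skew observed in real citation data.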
Traditional scientometric indicators are often restricted by the availability and quality of data, typically reflecting only bibliographic frequency. Data availability forms the foundation, while the selection of indicators remains vital. Current sci-tech evaluation indicators can be categorized as follows: first, simple quantitative indicators, which use basic frequency counts and represent the most elementary form of scientometric evaluation; second, composite indicators, which build on basic frequency data by factoring in averages and time windows, exemplified by the impact factor and CiteScore; and third, comprehensive indicators, which acknowledge that evaluation should go beyond publication counts and citation frequency and increasingly rely on holistic, multi-source, multi-dimensional assessment. However, simple frequency metrics based on limited bibliographic data fall short of providing a comprehensive picture. Research into alternative metrics, such as Altmetrics, which incorporate social media data horizontally, and content indicators, which incorporate semantic analysis vertically, is still in its infancy.
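The first two categories reduce to simple arithmetic. The sketch below computes the h-index (a pure frequency indicator) and a two-year impact factor (a composite indicator that averages citations over a time window); the input numbers are invented sample data.

```python
def h_index(citations):
    """h-index: the largest h such that at least h papers
    have at least h citations each."""
    h = 0
    for rank, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

def two_year_impact_factor(citations_this_year, citable_items_prev_two_years):
    """Citations received this year to items published in the previous
    two years, divided by the citable items published in those years."""
    return citations_this_year / citable_items_prev_two_years

# Invented sample data for illustration.
paper_citations = [25, 19, 12, 9, 7, 7, 4, 2, 1, 0]
print("h-index:", h_index(paper_citations))                 # -> 6
print("impact factor:", two_year_impact_factor(640, 200))   # -> 3.2
```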
Limited by evaluation tools and methods, traditional scientometric evaluation has widely substituted macro-level measurements for assessments of individual contributions. For instance, citation counts within a journal follow a skewed distribution: a few highly cited papers drive the impact factor, while most papers receive citations below the mean or remain uncited altogether. This means that a journal’s average citation rate does not accurately represent the actual citation landscape of its articles. The practice of “judging articles based on the journal” evaluates individual papers by the overall standing of the journal, while common citation indicators and newly developed Altmetrics indicators still provide only aggregated counts at the article level. These approaches produce low-precision evaluation results that fail to accurately capture the quality of individual papers.
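A worked example with invented citation counts shows how a few outliers pull the journal mean far above the typical article:

```python
from statistics import mean, median

# Invented citation counts for 12 articles in one hypothetical journal:
# two highly cited papers dominate, while most sit near zero.
citations = [310, 122, 9, 6, 4, 3, 2, 1, 1, 0, 0, 0]

avg = mean(citations)      # the "impact factor"-style average
mid = median(citations)    # a typical article
below_avg = sum(1 for c in citations if c < avg)

print(f"mean = {avg:.1f}, median = {mid}")
print(f"{below_avg} of {len(citations)} articles fall below the journal mean")
```

In this toy journal the mean exceeds the citation count of ten of the twelve articles, so it describes almost none of them.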
Traditional scientometric evaluations, rooted in numerical data, fall short in assessing content quality and value. Whether citation-based indicators, link-oriented webometric indicators, or Altmetrics relying on social media engagement, these metrics are fundamentally designed around frequency data, building evaluation models from quantitative counts. Such data inherently lacks the informational depth required to interpret academic contributions fully, as similar metric values may mask diverse behavioral patterns (e.g., the complex motivations behind citation behavior). No matter how refined these quantitative models become, they cannot uncover the intrinsic value of academic work. Moreover, with the rapid growth of scientific output, the cost of peer review has surged, yet frequency-based metrics cannot replace its depth. Techniques such as semantic analysis and neural networks for text-based data analysis and value assessment have not yet received sufficient attention.
Multi-dimensional empowerment
The development of scientometric evaluation has progressed through four main stages. Before 2000, citation-based indicators dominated, with a range of influence metrics derived from citation analysis within citation index databases. Between 2000 and 2010, webometrics emerged, introducing link-based indicators and establishing evaluation frameworks for the network environment. Between 2010 and 2020, the rise of social media fostered a more comprehensive Altmetrics evaluation system. After 2020, full-text analysis enabled by digital and intelligent technologies gained prominence, shifting evaluation toward fine-grained, micro-level, accurate, and efficient assessment. Today, data-driven and intelligent technologies are empowering scientometric evaluation along multiple dimensions.
First, data sources have diversified. Natural language processing and structured full-text databases have broadened the scope of evaluation from bibliographic metadata to full-text big data. Additionally, as data has become more open and accessible, evaluation has expanded beyond citation data to include multi-platform metrics from knowledge-sharing platforms, academic social platforms, mass social platforms, news outlets, disciplinary exchange platforms, and video websites. Advances in digital and intelligent technologies have further enabled a deeper reach into scientific big data, refining traditional coarse-grained evaluation into assessments at the level of knowledge units.
Second, metrics have become increasingly semantic. With the rise of open science and advances in artificial intelligence, semantically enriched metrics carrying greater informational depth have come into use. For example, citation content analysis now includes metrics that consider the position, intensity, function, and contextual semantics of citations, creating transformative evaluation indicators. In knowledge-unit analysis, natural language processing and deep learning enable detailed metric analysis of an article’s vocabulary, sentences, and overall text structure. This approach supports a full-text, knowledge-unit evaluation of academic outputs, encompassing terms, theories, methods, tools, research problems, figures, formulas, data, highlights, contributions, limitations, future directions, and language. Integrating full-text features and semantics with traditional metrics transforms evaluation indicators from external document features to content-based features, from syntactic to semantic, and from macro level to micro level.
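As a hedged sketch of what citation content analysis involves (the cue lexicon and section labels below are invented for illustration; production systems use trained NLP classifiers rather than keyword rules), the snippet locates [n]-style citation markers, records the section in which each appears, and assigns a rough rhetorical function:

```python
import re

# Cue lexicons here are illustrative assumptions, not a published coding scheme.
FUNCTION_CUES = {
    "method": ("we adopt", "following", "based on the method of"),
    "comparison": ("in contrast to", "unlike", "outperforms"),
    "background": ("prior work", "previous studies", "has been studied"),
}

def analyze_citations(sections):
    """Given {section_name: text}, find [n]-style citation markers, keep the
    surrounding sentence as context, and tag each citation with its section
    position and a rough rhetorical function from keyword rules."""
    records = []
    for section, text in sections.items():
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            for marker in re.findall(r"\[\d+\]", sentence):
                lowered = sentence.lower()
                function = next((label for label, cues in FUNCTION_CUES.items()
                                 if any(cue in lowered for cue in cues)),
                                "neutral")
                records.append({"citation": marker, "position": section,
                                "function": function, "context": sentence})
    return records

sample = {
    "introduction": "Citation impact has been studied extensively [1].",
    "methods": "We adopt the indicator proposed in [2] for full-text analysis.",
}
for r in analyze_citations(sample):
    print(r["citation"], r["position"], r["function"])
```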
Third, results are increasingly precise. Applying digital and intelligent technologies to combine full-text features, semantic analysis, traditional quantitative metrics, and academic exchange networks (both traditional citation networks and social media dissemination networks) enables knowledge-unit analysis that traces development pathways. This facilitates more precise assessment of scientific achievements by evaluating quality, value, and academic standing from the “knowledge-unit” perspective. For example, structure-based indicators derived from full texts, sentiment-based thematic metrics, semantic-association composite indicators, and comprehensive metrics can provide a multi-dimensional view of an academic work’s innovativeness, disruptiveness, and contributions to academia and society.
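One simple way such multi-dimensional indicators could be fused into a single view is a normalized weighted sum, sketched below; the dimension names, weights, and scores are invented assumptions, not a scheme proposed in the article.

```python
# Invented dimensions and weights; a real scheme would be empirically calibrated.
WEIGHTS = {"novelty": 0.4, "disruption": 0.3,
           "academic_impact": 0.2, "societal_impact": 0.1}

def normalize(values):
    """Min-max normalize a list of raw scores to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

def composite_scores(papers):
    """papers: {paper_id: {dimension: raw_score}} -> {paper_id: 0-1 score}."""
    ids = list(papers)
    scored = {pid: 0.0 for pid in ids}
    for dim, weight in WEIGHTS.items():
        normed = normalize([papers[pid][dim] for pid in ids])
        for pid, v in zip(ids, normed):
            scored[pid] += weight * v
    return scored

# Invented sample papers with raw scores on each dimension.
papers = {
    "paper_a": {"novelty": 0.8, "disruption": 0.2, "academic_impact": 120, "societal_impact": 35},
    "paper_b": {"novelty": 0.3, "disruption": 0.6, "academic_impact": 300, "societal_impact": 10},
    "paper_c": {"novelty": 0.5, "disruption": 0.1, "academic_impact": 45, "societal_impact": 80},
}
for pid, score in sorted(composite_scores(papers).items(), key=lambda kv: -kv[1]):
    print(f"{pid}: {score:.2f}")
```

Normalizing each dimension before weighting keeps incommensurable signals (counts, ratios, model scores) on a common scale.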
Fourth, evaluation services have become increasingly intelligent. As digital and intelligent technologies advance (particularly knowledge representation, reasoning, and artificial neural networks), the presentation of scientometric evaluation results is becoming more sophisticated, and the application of generative artificial intelligence further expands the potential for intelligent services. Within the evaluation process, digital recognition techniques and semantic description rules enable intelligent identification and extraction of content features. Semantic similarity algorithms facilitate content comparison and feature classification, while big data mining autonomously collects multi-dimensional evaluation data, allowing digital evaluation systems to independently assess factors such as innovativeness and contribution. Finally, evaluation results can be shared in real time via cloud computing.
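A dependency-free sketch of the kind of similarity computation involved in content comparison follows; bag-of-words cosine similarity stands in here for the trained embedding models practical systems would use.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Bag-of-words cosine similarity: a crude stand-in for embedding-based
    semantic comparison. Texts are tokenized by simple whitespace splitting."""
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[w] * vb[w] for w in set(va) & set(vb))
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

print(cosine_similarity(
    "full-text analysis enables fine-grained evaluation of papers",
    "fine-grained full-text analysis supports evaluation of academic papers"))
```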
On the service platform side, digital and intelligent technologies, combined with scientific big data, cloud services, tailored evaluation models, and tiered evaluation frameworks, allow the creation of a customized scientometric evaluation system. A cloud management layer can regulate and manage the evaluation cloud service platform, cloud storage pools, and cluster computing platform, delivering a service that is personalized in content, diverse in form, and proactive in delivery.
Yang Siluo is a professor at the School of Information Management, Wuhan University.
Edited by ZHAO YUAN