Promoting human-machine collaboration in interpreting

By LU XINCHAO / 06-27-2024 / Chinese Social Sciences Today

Human-machine collaboration may change the landscape of language services. Photo: TUCHONG


Over the past century, information and communication technologies have enabled significant advances in interpreting. In recent years, big data and artificial intelligence (AI) have flourished, and speech recognition, natural language processing, and related technologies have progressed rapidly. The breakthrough of neural machine translation has greatly enhanced the capabilities of machine interpreting, raising the question of whether machines will replace human interpreters altogether. It is therefore necessary to analyze, compare, and differentiate the processes, capabilities, quality, and outcomes of human and machine interpreting in order to identify their respective strengths and weaknesses and to provide insights into their future development.


Process comparison

Machine interpreting refers to real-time, automated speech translation of the source language (SL) by a computer system that comprises three main modules: speech recognition, machine translation, and speech synthesis. It relies fundamentally on existing, limited, and fixed usages or translation methods tied to specific contexts or situations. In contrast, human interpreting is a dynamic cognitive process that relies on human intelligence. It integrates knowledge, experience, context, and situational awareness, and calls upon various cognitive resources for SL comprehension, information storage, language conversion, and target language (TL) delivery. Human and machine interpreting differ fundamentally in their processing objects, layers, paths, and mechanisms, as well as in how strategies are applied.
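
The three-module structure described above can be pictured as a simple chain of stages. The following is only a minimal sketch: the class name `MachineInterpreter`, the stage signatures, and the toy lexicon are all hypothetical illustrations, not part of any real interpreting system. The toy translation stage is a fixed lookup, which loosely mirrors the reliance on existing, fixed translation methods.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures (assumptions for illustration only).
Recognizer = Callable[[bytes], str]    # SL audio -> SL text
Translator = Callable[[str], str]      # SL text  -> TL text
Synthesizer = Callable[[str], bytes]   # TL text  -> TL audio


@dataclass
class MachineInterpreter:
    """Chains speech recognition, machine translation, and speech synthesis."""
    recognize: Recognizer
    translate: Translator
    synthesize: Synthesizer

    def interpret(self, sl_audio: bytes) -> bytes:
        sl_text = self.recognize(sl_audio)   # module 1: speech recognition
        tl_text = self.translate(sl_text)    # module 2: machine translation
        return self.synthesize(tl_text)      # module 3: speech synthesis


# Toy stand-ins so the sketch runs; a real system would wrap actual engines.
def toy_recognize(audio: bytes) -> str:
    # Pretend the "audio" is just UTF-8 encoded text.
    return audio.decode("utf-8")


TOY_LEXICON = {"你好": "hello", "谢谢": "thank you"}


def toy_translate(text: str) -> str:
    # Fixed lookup: anything outside the lexicon passes through untranslated.
    return TOY_LEXICON.get(text, text)


def toy_synthesize(text: str) -> bytes:
    # Pretend synthesis simply encodes the TL text.
    return text.encode("utf-8")


pipeline = MachineInterpreter(toy_recognize, toy_translate, toy_synthesize)
```

In this sketch each stage sees only the output of the previous one, so an input missing from the lexicon is passed through unchanged rather than being resolved from context, as a human interpreter might attempt to do.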


Core ability comparison

The core capabilities of interpreting fall into three areas: message processing ability; resourcefulness, availability, and associative fluency; and verbal fluency. These three capabilities correspond to three aspects of the interpreting process: SL comprehension, bilingual conversion, and TL delivery. They also correspond to the three dimensions of interpreting quality: accuracy and completeness of content; correctness, authenticity, clarity, and efficiency of language usage; and clarity, fluency, and rhythm of delivery. In addition, interpreting ability encompasses information storage, communication, strategic application, rapid learning, and stress resistance. The specific advantages and disadvantages of interpreters and machines in terms of interpreting competence are as follows.


Interpreters possess rich cognitive resources, deep understanding, high tolerance for errors (ambiguity), efficient and diverse expression, and high availability. They excel at using strategies and can undertake complex translation tasks. They also possess interpersonal communication skills, identity positioning, and emotional warmth. However, human interpreters cost more to use, and their performance may be constrained by physiological and psychological factors, SL variability, contextual factors, work duration, workload, and other factors.


Machine interpreting has lower overall costs, immense storage capacity, higher learning efficiency, faster processing speed, and full automation. It is not affected by physiological and psychological factors, work duration, or workload, so its performance is more stable. However, machine interpreting also has the following shortcomings: difficulty processing speech rhythm information and multimodal information beyond the SL; high dependency on speech quality; difficulty with new terms, proper nouns, low-frequency words, complex structures, and colloquial expressions due to training data limitations; a primary focus on literal translation, with little capacity to process pragmatic information, emotional cues, and metaphorical expressions; inadequate on-the-spot strategic abilities; and difficulty undertaking complex translation tasks and engaging in real-time communication with speakers and audiences.


The evaluation of interpreting quality generally involves three levels: the interlingual level (comparison of the SL and the TL), the intralingual level (the sounds, language, and logic of the TL), and the tool level (the comprehensibility and usability of the TL).


In preliminary empirical research, the author found the following tendencies. Machines are highly sensitive to prosody, whereas human interpreters are moderately sensitive. Machines are more sensitive than human interpreters to language factors (formality, standardization, clarity, complexity, flexibility). Machines are less sensitive to content (subject-matter expertise, etc.), whereas human interpreters are more sensitive. Machines are not very sensitive to time-related factors (speech rate, proposition information density, information component density, etc.), whereas human interpreters are extremely sensitive.


The above comparative analysis indicates that there is considerable room for complementary cooperation between humans and machines, particularly in simultaneous interpreting.


Future research on machine translation should focus on simulating the advantages and capabilities of human interpreting mechanisms. The goal should be to achieve multi-scenario human-machine collaborative applications by moving from single-SL speech processing to multimodal information processing, from language processing to pragmatic information processing, and from a focus on speech recognition and machine translation to learning and simulating human cognitive processes and abilities. Applications should move from simple single-scenario, single-mode operations to customizable applications for complex scenarios and composite operations, and from theoretical model construction to meeting diverse real-world communication needs.


Lu Xinchao is an associate professor at the Graduate School of Translation and Interpretation, Beijing Foreign Studies University.


Edited by ZHAO YUAN