Translation and Technology Essay

Introduction

For over half a century, the demand for a variety of translations by different groups of end-users has enabled many types of translation tools to be developed. This is reflected in the systems that will be discussed in this book, ranging from machine translation systems to computer-aided translation tools and translation resources.

The majority of books and articles on translation technology focusing on the development of these systems and tools have been written from the point of view of researchers and developers. More recent publications written with translators in mind have focused on the use of particular tools. This book is intended as an introduction to translation technology for students of translation. It can also be useful to professional translators and those interested in knowing about translation technology. A different approach is taken here: descriptions of particular tools are not provided; instead, the development of different machine translation and computer-aided translation tools and their uses are discussed.

Programming details and mathematical equations are not considered, except in the discussion of the statistical approach to machine translation where minimal essential formulae are included. Descriptions are given to allow readers to further investigate specific approaches or issues that might interest them, using references cited throughout the book.

It is also important to note that no particular approach or design is deemed to be better than any other. Each has its own strengths and weaknesses. In many cases, readers will find that examples of systems and tools are given, but this does not suggest that they are the best; they are simply examples to illustrate the points made.

While researching this book, I discovered that the majority of publications from the literature on translation technology are about the development of machine translation systems, primarily involving experimental systems developed or being developed at a number of universities and large commercial corporations across the globe. The book will show that many of these systems never achieved their commercial potential and remained as experimental tools, while some others served as tools for other natural-language processing applications. By contrast, not much literature seems to be available on computer-aided tools such as translation memory systems.

As we shall see in this book, most computer-aided translation tools are developed by commercial companies and, as a result, progress reports on these tools are rarely published in the public domain. Furthermore, to cater to different needs and demands, a tool like a translation memory system comes in many versions from the most basic to the most advanced. Insights into the use of these tools can be found in translator magazines and occasionally also posted on the World Wide Web (WWW). The evaluation of translation tools falls into a field that is well-researched. Again we will see that most of the literature focuses on the evaluation of machine translation systems.

Furthermore, the extensive use of translation tools and translation processes involved in the localization industry tends to be discussed separately, giving the impression that they are not related to translation. These two areas are, however, directly relevant to translation technology. Hence they are also included in this book.

Essentially, the book contains what is felt should be included in order to provide an overview of translation technology. In order to keep the book at the given length, the topics have been carefully selected with some described in greater detail than others. In some chapters, an abbreviated historical background has been deemed necessary in order to provide a better understanding of the topics discussed, especially in the description of the development of machine translation systems and their evaluation.

However, in all cases, references have been provided which readers may choose to pursue at a later time. Suggestions for further reading are provided at the end of every chapter (Chapters 1 to 6).

The first chapter discusses the definitions of terms referring to the use of computers in translation activities. Some of the terms can be confusing to anyone who is unfamiliar with translation tools. In some cases, the same translation tools are given different names depending on what they are used for; in other cases, a tool may be differently classified depending on the perspective of those who have developed that tool. The aim in this chapter is therefore to clarify these terminological and related matters.

An alternative perspective to the four basic translation types – fully automated high-quality machine translation, human-aided machine translation, machine-aided human translation, and human translation – first proposed by Hutchins and Somers (1992) is introduced to reflect current developments in translation technology. This will be explored in more detail in the final chapter where the four translation types are reviewed in relation to topics described in the book.

The second chapter discusses technology within the larger framework of Translation Studies as a discipline, focusing on the relationship between the engineering of translation technology, on the one hand, and Translation Studies including translation theory, on the other hand. The relationship between academic and professional groups involved in translation is also examined.

This in turn leads to a discussion of the involvement of a particular approach in linguistic theories – known as ‘formalisms’ in natural-language processing – especially in the design of machine translation systems. A different perspective on the translation process involving pre- and post-editing tasks using a special variety of language called ‘controlled language’ is also presented. This translation process is described using the translation model proposed by Jakobson (1959/2000), a translation model that differs significantly from the one proposed by Nida (1969).

The third chapter gives detailed descriptions of different machine translation system designs also known as ‘architectures’. The development of machine translation over several decades, its capabilities and the different types of machine translation systems, past and present, are also included. Both experimental and commercial systems are discussed, although the focus is on the experimental systems. Even though machine translation has been well-documented elsewhere, a discussion is deemed to be important for this book. It is felt that modern-day professional translators should be informed about machine translation systems because there is every reason to believe, as we shall discover in Chapter 6, that future trends in translation technology are moving towards integrated systems where at least one translation tool is combined with another, as is already the case in the integration of machine translation with translation memory.

The fourth chapter describes the architectures and uses of several computer-aided translation tools, such as translation memory systems, as well as resources such as parallel corpora. Unlike machine translation systems, which are largely developed by universities, most computer-aided translation tools are developed by commercial companies. Thus, information about such tools is harder to obtain. This chapter will also show that computer-aided translation tools are becoming more advanced and using different operating systems, and so ‘standards for data interchange’ have been created. Three different standards are described. Currently available commercial translation tools are also discussed.

In addition, this chapter presents an overview of other commercially available tools such as those used in the localization industry.

The fifth chapter touches on the evaluation of translation technology. The discussion focuses on different groups of stakeholders from research sponsors to end-users. Also included in the discussion are the different methods of evaluation: human, machine, and a combination of human and machine as evaluator. The choice of method used depends on who the evaluation is for and its purpose. It also depends on whether an entire tool or only some components are evaluated. Also described in this chapter is the general framework of evaluation offered by various research groups in the USA and Europe.

The literature on evaluation concentrates on the evaluation of machine translation systems either during the developmental stage or after the process of development is completed. Less information is available on the evaluation of computer-aided translation tools. What is available is found mainly in translation journals, magazines and newsletters.

The sixth chapter presents some recent developments and shows the direction in which translation technology is heading, in particular regarding the future of machine translation systems that are now incorporating speech technology features. The integration of speech technology and traditional machine translation systems allows translation not only between texts or between stretches of speech, but also between text and speech.

This integration is proving to be useful in many specific situations around the globe especially in international relations and trade. This chapter also looks at research projects in countries that are involved in the development of translation tools for minority languages and discusses the problems encountered in developing machine translation systems for languages that are less well-known and not widely spoken. Another form of technology called the ‘Semantic Web’ that has the potential to improve the performance of certain machine translation systems is also described. Included in this chapter, too, are issues such as linguistic dominance and translation demands on the WWW that are already shaping parts of the translation industry.

The book concludes by presenting an expanded version of the four basic classifications of translation types as suggested by Hutchins and Somers (1992) and introduced in Chapter 1. It is concluded that the one-dimensional linear continuum originally proposed is no longer able to accurately reflect current developments in translation technology.

Translation tools today come in different versions and types depending on the purposes for which they are built. Some are multifunctional while others remain monofunctional. An alternative way must therefore be found to depict the complexities and multidimensional relationships between the four translation types and the topics discussed in this book.

It is not possible to put every single subject discussed here into one diagram or figure, and so, in order to gain a better understanding of how the issues are related to one another, they are divided into groups. Topics or issues in each group have a common theme that links them together, and are presented in a series of tables. However, it is important to bear in mind that not all topics can be presented neatly and easily even in this way.

This clearly shows the complexity and multidimensionality of translation activities in the modern technological world.

At the end of the book, several Appendices provide information on the various Internet sites for many different translation tools and translation support tools such as monolingual, bilingual, trilingual and multilingual dictionaries, glossaries, thesauri and encyclopaedias.

Only a selected few are listed here, and as a result the lists are not exhaustive. It is also important to note that some Internet sites may not be permanent; at the time of writing, every effort has been made to ensure that all sites are accessible.

1 Definition of Terms

In translation technology, terms commonly used to describe translation tools are as follows:

• machine translation (MT);
• machine-aided/assisted human translation (MAHT);
• human-aided/assisted machine translation (HAMT);
• computer-aided/assisted translation (CAT);
• machine-aided/assisted translation (MAT);
• fully automatic high-quality (machine) translation (FAHQT/FAHQMT).

Distinctions between some of these terms are not always clear.

For example, computer-aided translation (CAT) is often the term used in Translation Studies (TS) and the localization industry (see the second part of this chapter), while the software community which develops this type of tool prefers to call it ‘machine-aided translation’ (MAT). As the more familiar term among professional translators and in the field of Translation Studies, ‘computer-aided translation’ is used throughout the book to represent both computer-aided translation and machine-aided translation tools, and the term ‘aided’ is chosen instead of ‘assisted’, as also in ‘human-aided machine translation’ and ‘machine-aided human translation’.

Figure 1.1 distinguishes four types of translation relating human and machine involvement in a classification along a linear continuum introduced by Hutchins and Somers (1992: 148). This classification, now more than a decade old, will become harder to sustain as more tools become multifunctional, as we shall see in Chapters 3, 4 and 6. Nevertheless, the concept in Figure 1.1 remains useful as a point of reference for classifying translation in relation to technology.

[Figure 1.1 Classification of translation types. A linear continuum running from machine to human: fully automated high-quality (machine) translation (FAHQT/FAHQMT), human-aided machine translation (HAMT), machine-aided human translation (MAHT), human translation (HT). MT = machine translation; CAT = computer-aided translation. Source: Hutchins and Somers (1992: 148).]

The initial goal of machine translation was to build a fully automatic high-quality machine translation system that did not require any human intervention. At a 1952 conference, however, Bar-Hillel reported that building a fully automatic translation system was unrealistic and years later still remained convinced that a fully automatic high-quality machine translation system was essentially unattainable (Bar-Hillel 1960/2003: 45). Instead, what has emerged in its place is machine translation, placed between FAHQT and HAMT on the continuum of Figure 1.1.

The main aim of machine translation is still to generate translation automatically, but it is no longer required that the output quality is high, rather that it is fit-for-purpose (see Chapters 2 and 3). As for human-aided machine translation and machine-aided human translation, the boundary between these two areas is especially unclear. Both classes are considered to be computer-aided translation as indicated in Figure 1.1 (Tong 1994: 4,730; see also Slocum 1988; Hutchins and Somers 1992). However, in Schadek and Moses (2001), a different classification has been proposed where only machine-aided human translation is viewed as synonymous with computer-aided translation. Human-aided machine translation is considered as a separate category.

The reasoning behind the view offered by Schadek and Moses is not difficult to understand. At least theoretically, the difference between the two is obvious. For human-aided machine translation, the machine is the principal translator, while in machine-aided human translation it is a human. In practice, however, it may be less easy today to draw a clear boundary between them. The blurring of boundaries is further complicated when human-aided machine translation is considered as a subclass of machine translation, an approach chosen by Chellamuthu (2002). Since human-aided machine translation has the machine as the principal translator –

