8. Information Science as a Bridging Discipline

Approaches to sense disambiguation with respect to automatic indexing and machine translation

Heinz-Dirk Luckhardt

1. Introduction


Research and development in natural language processing (NLP) has been carried out at the University of Saarbrücken since the end of the 1960s. Throughout that time, the problem of ambiguity in natural language has been a recurring theme in many projects and system developments, in a number of which the author has collaborated (cf. the SUSY project). The NLP community is well aware of the problem, but generally applicable solutions are rare; most approaches target specific problems or domains in which ambiguities can be kept to a minimum. The present paper gives an impression of some ways in which ambiguities may be tackled and of the directions further research may take. The NLP areas in question are natural language parsing, machine translation and automatic (multilingual) indexing. I shall try to take into account those criteria that are relevant for machine translation and automatic indexing.

Four fields will be discussed, which differ in the kind of ambiguity involved and in the criteria used for disambiguation. The first takes morphological and simple syntactic criteria into account and deals with tagging (only a brief account is given). The second deals with lexical issues and the whole complex of criteria contained in the notion of sublanguage. The third goes beyond syntactic criteria and takes a short step towards a semantic interlingua. The fourth goes beyond the sentence level and introduces the idea of employing thesaurus relations for lexical disambiguation.