12 text processing and cognitive technologies






POLYSEMY AS COGNITIVE PROBLEM⁹
Irina Frishberg¹⁰
ABSTRACT

In this article we examine polysemy in English and Russian from a cognitive perspective. To make the comparison explicit, we introduce the term ‘polysemy coefficient’, which turns out to differ depending on the language and the part of speech. This approach helps us to interpret the results more effectively.
KEYWORDS

Semantics, polysemy, cognitive approach, polysemy coefficient.

DIFFERENTIAL LINGUISTICS¹¹
Ilya Geller¹²
ABSTRACT

In the course of the NIST TRECs of 2003 and 2004, I created and tested a computer program for textual information searches, based on ‘understanding’ the meanings of words in texts. A computer running the program ‘understands’ not only the abstract, standardized meanings of the words in the text, but also the specific, concrete meanings given to those words by the author(s) of the texts. In this article I attempt to bring the language I used to create the algorithm of the program in line with the generally accepted, formalized language of mathematics. For example, I explain why I consider the paragraph to be the ‘second derivative of the function of describing Reality’, and what I understand Reality to be. (To clarify my understanding of Reality I apply the philosophy of Cynicism.) Along with that, I introduce an understanding of the ‘prototype’ of a paragraph as it is ‘integrated’ into the first derivative of the function of describing Reality. I also show that it is precisely the existence of the ‘prototype’ that allows the computer to ‘understand’ the meanings of the words in the text.
Axiom 1. Words exist.

Definition 1. I understand a word in any given language to be a combination of letters in the form in which the word appears in print in a generally accepted dictionary of that language. The combination of letters by which the word is recorded in the dictionary is recognized as the ‘normal form’ of that word, to which all ‘non-normal’ forms of the given word can be reduced. By a ‘non-normal’ form of a word I mean a form which arises from adding prefixes, suffixes, endings, etc., to the normal form of the word, or a form resulting from the introduction of a grammatical error into the word.

Use of the dictionary of a language allows one to present each word in numerical form. Differential Linguistics thus works with numbers; and the system for reducing non-normal forms of words to their normal forms can be seen as a system for reducing words to numbers.
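
The reduction of words to numbers described above can be illustrated with a minimal sketch. The tiny dictionary, the table of non-normal forms and the function name `to_number` are assumptions made for illustration only; they are not taken from the author's program.

```python
# Toy dictionary (hypothetical): the position of each normal form is its number.
DICTIONARY = ["be", "break", "cup", "red", "tea"]
WORD_TO_NUMBER = {word: number for number, word in enumerate(DICTIONARY)}

# Hypothetical table reducing non-normal forms (inflections, misspellings)
# to the normal forms recorded in the dictionary.
NON_NORMAL_TO_NORMAL = {
    "cups": "cup",
    "broke": "break",
    "broken": "break",
    "was": "be",
    "tee": "tea",
}


def to_number(token):
    """Reduce a token to its normal form and return its dictionary number.

    Returns None when the word is not in the toy dictionary.
    """
    word = token.lower()
    word = NON_NORMAL_TO_NORMAL.get(word, word)
    return WORD_TO_NUMBER.get(word)


# Each token of a sentence becomes a number (or None if it is unknown).
numbers = [to_number(token) for token in "The cups broke".split()]
```

A word that is absent from the dictionary, such as ‘ggffrrtte’, simply has no number in this sketch.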
Definition 2. The meaning of a word is how the word is used and what the word is.
Definition 3. Any word taken separately in its normal form is a ‘non-predicative definition’. I have called combinations of normal forms of words – nouns/pronouns-verbs-adjectives – ‘predicative definitions’.

Note. Noam Chomsky. In 1957 Noam Chomsky [1] proposed calling the combinations of words that convey the meaning of a sentence ‘kernel sentences’. I have preferred to follow an immeasurably more ancient tradition, which began with Aristotle, and to call such combinations ‘predicative definitions’ (if they are reduced to their normal forms).

Definition 4. I understand only the normal form of a word to be a non-predicative definition, where a non-predicative definition and an abstraction/universal [6; ‘The World of Universals’] are the same thing.
I claim that any non-predicative definition carries all of the meanings of its word.
Clarification. Philosophy. I have chosen, as an intellectual basis for my program, the philosophy of Cynicism, which I see as superseding the philosophy of Idealism.
I suppose that Cynicism, as a collection of dogmas in written form, was originally created by the Biblical authors Ecclesiastes and Jeremiah who, I think, were opposing Plato and Aristotle. I consider Cynicism to be based on the notion that the Ideal is single, doesn't change and therefore doesn't exist in time – the Ideal exists only in the immutability of eternity, in timelessness¹³.

I am certain that the Idealists Plato and Aristotle – as well as their followers Hegel, Bradley and Russell – supposed that there exists a multiplicity of ideals, that ideals exist in time/not in time and can be distinguished one from another. (Idealists have never explained how one could make distinctions between things that were the same – absolutely identical. Hegel pretended that the question didn't exist – see the part ‘Quantity’ of his ‘The Science of Logic’: he supposed that nothing exists in time as well as beings. Russell did not think so – ‘But universals … subsist or have being, where ‘being’ is opposed to ‘existence’ as being timeless’ [6, p.100]. But how could a plurality of ideals, or of Russell’s universals, not exist in time and still remain a plurality?)

Having applied the concepts of Cynicism to Linguistics I have come to the aforementioned conclusion, stated in Definition No. 13, that the normal form of a word is the abstraction/universal – that is, it has an indefinitely large and in no way distinguishable number of meanings as long as the word hasn't been combined with other words and/or given a non-normal form.

Take, for instance, the word ‘ggffrrtte’. In its normal form it means everything and nothing at the same time – unless it is explained. Hegel said about this phenomenon: ‘… pure being is the pure abstraction … which when taken immediately, is equal nothing. From this... a definition of the Absolute followed, that it is nothing... Hence, the truth of being and nothing alike is the unity of both of them; this unity is becoming’ [5, p.140-141]. To possess a meaning, the word ‘ggffrrtte’ must become! I think that the inclusion of the normal form of a word into a structure in combination with other words and/or its modification into a non-normal form transforms the [abstract] word into a concrete word with a concrete meaning.
Note. Russell’s Non-Predicative Definition. Bertrand Russell introduced the notion of a ‘non-predicative’ definition, in which what is to be defined is brought in through its relation to a class of which it is an element [6,7,8,9]. For example: ‘the set of all sets that are not elements of themselves’.

But the given affirmation – ‘the set of all sets that are not elements of themselves’ – is a combination of sets of words in normal and non-normal form, intended to clarify the meaning of the word ‘set’. The word ‘set’ by itself, however, in its normal form and not in combination with other words, can implicitly carry the meaning it has in ‘the set of all sets that are not elements of themselves’ together with many other meanings¹⁴. The same holds for the word ‘ggffrrtte’ until it is explained!

To make an analogy with Set Theory: if it is given that the normal form of a word has a countable set of meanings N, then when that word is included in a combination of many words that set of meanings is reduced to the dimensions of its intersection with the sets of meanings of the other words, that is, to the set M, $M \subseteq N$ (M is a subset of N; the power, or number of elements, of M is less than or equal to the power of N).
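
The set-theoretic analogy above can be sketched in a few lines. The meaning labels and the words chosen are invented for illustration, and treating meanings as labels shared between words is a simplifying assumption of this sketch, not a claim about the author's program.

```python
# Hypothetical sets of meanings for the normal forms of three words.
MEANINGS = {
    "red": {"colour", "communist", "wine", "proper name"},
    "paint": {"colour", "coating", "cosmetics"},
    "wall": {"colour", "coating", "barrier"},
}


def narrowed_meanings(word, other_words):
    """Intersect the meanings of `word` with those of the words combined with it."""
    narrowed = set(MEANINGS[word])  # copy, so the original set N is untouched
    for other in other_words:
        narrowed &= MEANINGS[other]
    return narrowed


N = MEANINGS["red"]                               # meanings of the normal form alone
M = narrowed_meanings("red", ["paint", "wall"])   # meanings left inside the combination

# M is a subset of N, so its power cannot exceed the power of N.
assert M <= N and len(M) <= len(N)
```

In this toy data the combination leaves ‘red’ with the single meaning ‘colour’.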
Axiom 2. There exists a countable and limited number of parts of speech.

Definition 5. Each part of speech explains not what the word is, but only how the word is used.

Any normal form of a word can belong to one or to several parts of speech.

Postulate 1. In most cases, it is only possible to identify the part of speech to which a given word belongs by analyzing the combination of the normal or non-normal form of the given word with the normal and/or non-normal forms of other words, with the given word present explicitly or implicitly in such combinations¹⁵. In a few cases one can identify the part of speech to which a given word belongs even without analyzing the combination of the normal and/or non-normal form of the given word with other words¹⁶.

Definition 6. Words and their combinations always appear as parts of a sentence¹⁷, and sentences always form a paragraph.
Axiom 3. There is Reality¹⁸. Words, sentences and paragraphs are used to describe Reality.
Reality is everything that is and is not. Also, I am not interested in whether the unicorn really lives or in how long its horn is: it is a part of Reality, since Reality is everything that is and is not [6,7,8,9].
Postulate 2. Reality is always and continuously changing.

I am unaware of any unchanging Reality.
Definition 7. Context. I understand context to be the description of concrete, named parts of Reality and of what happens to them. Context can only be provided by a sum of combinations – always no less than one combination – of normal forms of words, extracted from a sentence of a paragraph of a text; where the normal form of words is arrived at by the reduction of non-normal forms to normal. Moreover, such combinations must always conform to the structure of the following triad: substantive (noun), verb, adjective.
Definition 8. Subtext. I understand subtext to be the description of unnamed and unnamable parts of Reality and of what happens to Reality and its parts. Subtext can only be provided by a sum of combinations - always no less than one combination - of normal forms of words - combinations of no less than three parts of speech - pronoun, verb and adjective.

A method and system for extracting context and subtext from texts/paragraphs is described in the article ‘The Role and Meaning of Predicative and Non-Predicative Definitions in the Search for Information’ [2]. The system consists of

  1. The reduction of the non-normal forms of the words used in texts and paragraphs to their normal forms

  2. And the compilation of the combinations of those words within the sentences of texts and paragraphs.

The method thus consists of the summarization of the combinations of normal forms of words found in texts and paragraphs (taken as sums of sentences) in order to establish the contexts and subtexts of texts and their paragraphs.
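
The two steps and the summarization just described can be sketched as follows. This is only an illustration under stated assumptions: the toy lexicon, the function names `triads` and `summarize`, the sentence splitting on punctuation, and the choice to pair every noun (or pronoun) with every verb and adjective of the same sentence are all hypothetical and are not taken from the cited article [2].

```python
import re
from collections import Counter
from itertools import product

# Hypothetical lexicon: token -> (normal form, part of speech).
LEXICON = {
    "cup": ("cup", "noun"), "cups": ("cup", "noun"),
    "tea": ("tea", "noun"),
    "it": ("it", "pronoun"),
    "broke": ("break", "verb"), "breaks": ("break", "verb"),
    "was": ("be", "verb"), "is": ("be", "verb"),
    "red": ("red", "adjective"),
    "hot": ("hot", "adjective"),
}


def triads(sentence, head_pos):
    """Return (head, verb, adjective) combinations of normal forms in one sentence."""
    by_pos = {"noun": [], "pronoun": [], "verb": [], "adjective": []}
    for token in re.findall(r"[a-z]+", sentence.lower()):
        if token in LEXICON:
            normal, pos = LEXICON[token]  # step 1: reduce to the normal form
            if normal not in by_pos[pos]:
                by_pos[pos].append(normal)
    # Step 2: compile the combinations found within the sentence.
    return list(product(by_pos[head_pos], by_pos["verb"], by_pos["adjective"]))


def summarize(paragraph):
    """Sum the combinations over the sentences of a paragraph.

    Noun-headed triads are counted as context, pronoun-headed ones as subtext.
    """
    context, subtext = Counter(), Counter()
    for sentence in re.split(r"[.!?]+", paragraph):
        context.update(triads(sentence, "noun"))
        subtext.update(triads(sentence, "pronoun"))
    return context, subtext


context, subtext = summarize("The red cup broke. It was hot.")
```

Here the first sentence contributes the context triad (cup, break, red) and the second contributes the subtext triad (it, be, hot).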

I am certain that the context and subtext of a paragraph provide a very limited set R of meanings for every word used in the paragraph, where $R \subseteq M$ [R is a subset of M which has a power (cardinality) less than or equal to that of M, so that $R \subseteq M \subseteq N$].
Definition 9. A paragraph is devoted to the description of a limited number of parts of Reality¹⁹: a paragraph, as distinct from the sentences that compose it, must in the overwhelming majority of cases have an absolutely simultaneously defined context and subtext, which allow one in most cases to understand without ambiguity the meaning of the words used in the paragraph.

If a paragraph does not have a simultaneously defined context and subtext, it means that an error has crept into the paragraph and/or it should be linked to (an)other paragraph(s) of the text.
Observation 1. Context and subtext can only be provided by a sentence, a paragraph, or a text.
Note to Observation 1. In order to provide context and subtext one needs a minimum of one combination of normal forms of words. Such a combination of normal and/or non-normal forms of words is a sentence. And a paragraph and a text are a sum of some/many sentences.

Observation 2. It is not always possible to determine to which part of speech a word belongs by examining only one single sentence in which it is explicitly or implicitly present.
Note to Observation 2. A sentence may consist of only one word which is neither a proper Name nor an appellation - i.e., it is what I call a ‘single-part sentence’. In that case it is impossible to examine the combination of words that would allow one to identify the part of speech to which the word belongs, since there is no combination of words in the sentence.

For example, if someone creates the sentence, ‘Red’, one cannot tell to which part of speech the word used in the sentence belongs. ‘Red’ could refer to a colour, or be a pejorative term for a Communist, or it could be a proper Name or an alias. In the first case the word ‘red’ would be an adjective, in the others it would be a noun.
Observation 3. The definition of how a word is used - to which part of speech it belongs - is most often made possible by an examination of the paragraph in which the word is used.
Note to Observation 3. Indeed, since a paragraph is a set of sentences (no fewer than one sentence), a paragraph makes it possible to define the part of speech to which a word belongs.
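
Observations 2 and 3 can be illustrated with a deliberately naive sketch. The word lists and the rule used here (a candidate word followed by a known noun is read as an adjective, one followed by a known verb as a noun) are assumptions invented for the example; they stand in for whatever analysis of combinations the program actually performs.

```python
import re

# Hypothetical data: parts of speech a word may belong to, plus tiny word lists.
CANDIDATE_POS = {"red": {"adjective", "noun"}}
NOUNS = {"cup", "flag"}
VERBS = {"spoke", "arrived"}


def resolve_pos(word, text):
    """Return the parts of speech of `word` that the surrounding text supports.

    If the text offers no usable combination, all candidates are returned,
    i.e. the word stays ambiguous.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    found = set()
    # Naive rule: look only at the token that immediately follows the word.
    for current, following in zip(tokens, tokens[1:]):
        if current != word:
            continue
        if following in NOUNS:
            found.add("adjective")
        elif following in VERBS:
            found.add("noun")
    return found or set(CANDIDATE_POS.get(word, ()))


alone = resolve_pos("red", "Red.")                           # single-part sentence
in_paragraph = resolve_pos("red", "Red. The red cup fell.")  # same sentence inside a paragraph
```

The single-part sentence leaves both readings open, while the paragraph narrows ‘red’ to an adjective in this toy setting.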
The Function of the Description of Reality. Description in words is dependent on changing Reality.
I have come to the conclusion that as Reality changes the description of it always and inescapably changes as well, reflecting the changing of Reality itself. For example, if someone dropped a cup of tea on the floor and it broke, spilling the tea, then a description of the broken cup and the spilled tea would be entirely different from a description of the cup before it fell on the floor.

Moreover, out of the given set of all possible subjective descriptions of Reality E, one can relate to described Reality – let's call it x – only one subjective description of Reality, designated as y = f(x). One can then say that for the set E a function of description of Reality is provided in the terms

$$y = f(x), \quad x \in E$$

where E is the domain of subjective definition, the set of all possible states of Reality.
Postulate 3. Only subjective descriptions of Reality exist.
Note. Subjectivism²⁰. The function of description of Reality in words is always subjective, since only a subject (the observer) is in a position to describe Reality in words. In any case, I am unaware of any descriptions of Reality created directly or indirectly by an object, and not by a subject. I am certain that even if objects exist that are capable of describing Reality in words, they were created and/or taught to describe Reality by a human being.

The function of the description of Reality can be provided analytically, if one considers proven the hypothesis of the presence in every subject of an individual and limited set of lexical habits [3], the existence of which is determined simultaneously by the Aesthetic and Ethical components [4] of the subject's mind. Indeed, if one knows which words and combinations thereof must inescapably be used by the subject in describing Reality (and its parts), one can analytically provide the function of description of Reality by the given subject. For this one must know

  1. The subject's emotional relation to Reality and its parts, knowledge of which is transmitted by the Ethical component of the subject's mind;

  2. The subject's own knowledge of Reality and its parts, which is the Aesthetic component of the subject's mind.
