This file contains the following subfiles:
 
15 - collapse of sentences without information loss
15.25 - a limit on summing functions
16 - phrases in all three environments
16.5 - two methods for calculating phrase boundaries: examples
 


(subfile 15: collapse of sentences without information loss)

It is not necessary to lose any information when words are put together into summed objects. To avoid such losses it is necessary 1) to retain the information contained in the order of appearance of the addends and 2) to take care of axis duplications. We want to arrive at unique MS locations for sentences that 1) have the same set of words in differing orders, or 2) have overlapping axis sets.

The first problem is trivially solved by inserting locator codes (e.g. “first word in second sentence”) for the various elements as they are added. Such codes ensure that summations of the two sentences “Men like women” and “Women like men” have different MS locations.
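
A minimal sketch of the locator-code idea, in Python. It assumes, purely for illustration, that a word's definition is a mapping from axis ID to value and that an MS location can be stood in for by a sorted tuple of entries. The axis IDs, the values, and both function names are invented here and are not taken from the program itself.

```python
# Hypothetical word definitions: axis ID -> value (all numbers invented).
DEFINITIONS = {
    "men": {101: 1.0, 102: 0.5},
    "like": {201: 1.0},
    "women": {101: 1.0, 102: -0.5},
}

def location_without_locators(words):
    """Sum axis values with no record of word order."""
    total = {}
    for word in words:
        for axis, value in DEFINITIONS[word].items():
            total[axis] = total.get(axis, 0.0) + value
    return tuple(sorted(total.items()))

def location_with_locators(words, sentence_no=1):
    """Tag every addend with a locator code: (sentence number, word position)."""
    entries = []
    for position, word in enumerate(words, start=1):
        for axis, value in DEFINITIONS[word].items():
            entries.append(((sentence_no, position), axis, value))
    return tuple(sorted(entries))

a = ["men", "like", "women"]
b = ["women", "like", "men"]
print(location_without_locators(a) == location_without_locators(b))  # True
print(location_with_locators(a) == location_with_locators(b))        # False
```

Without locator codes the two sentences collapse to the same location; with them, the order of the addends survives the summation.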


There are two ways to address the second problem.

1)
Originally I was using some sleight-of-hand. If some of the words being combined include common axes in their definitions, then the program is confronted with the need to store, or combine, two values for a single axis; unfortunately, the program must constantly perform calculations that require each axis to have a single value. Combining the two values, say by averaging, loses some information.

The saving grace is that this algorithm operates primarily by retrieving items stored in a content-addressable fashion: most functions require the storage and manipulation of objects whose location is thus encoded. The program doesn't depend on being sensible or intelligible to us. Therefore a solution that seems messy to us may be perfectly adequate for everything the program has to do.

In the case of this second problem, the question is "how can we set up storage of a summed object when multiple addends share axes?" The answer is to add an axis - replacing one of the duplicates with a dummy that has the same properties as the original, but a different ID number.

Axes have properties (held in their word-form definition) and they exist at differing angles with other axes. They are identified by an ID number. To solve the two-value problem, we simply create a new axis, with the same definition and angles as the original, but with a different ID number. All the requirements of unique location in the MS content-addressable memory are satisfied by this kluge. In any kind of "real" space, this would result in two points at the same place, but it will map to different memory locations perfectly well.
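
The dummy-axis kluge can be sketched as follows, in Python. The `Axis` record, the `sum_words` function, and the ID counter starting at 1000 are all hypothetical stand-ins chosen for the example; the only point carried over from the text is that the dummy copies the original's definition and angles and receives a fresh ID number.

```python
from dataclasses import dataclass, replace
from itertools import count

@dataclass(frozen=True)
class Axis:
    axis_id: int
    definition: str      # stands in for the word-form definition
    angles: tuple = ()   # stands in for the angles with other axes

_next_id = count(1000)   # hypothetical source of unused ID numbers

def sum_words(words, axes):
    """Combine words (axis ID -> value) so that every axis holds one value.

    `axes` maps axis ID -> Axis and is extended with any dummies created.
    """
    total = {}
    for word in words:
        for axis_id, value in word.items():
            if axis_id in total:
                # Duplicate axis: same definition and angles, new ID number.
                dummy = replace(axes[axis_id], axis_id=next(_next_id))
                axes[dummy.axis_id] = dummy
                axis_id = dummy.axis_id
            total[axis_id] = value
    return total

axes = {7: Axis(7, "size", ((8, 90.0),))}
summed = sum_words([{7: 0.9}, {7: 0.1}], axes)
print(summed)  # {7: 0.9, 1000: 0.1}
```

Both values are kept, each under its own key, so nothing is averaged away.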


2)
As the proposed data structures migrated more and more towards the same structure, it became clear that axes could have meaning added using the same procedure as is used for verb-forms, plurals, possessives, and homonyms. In those cases, new words are temporarily created that consist of the dictionary definition of the root form plus a minimal word-structure (called a "plus-word") containing the information needed to transform the ur-form into the inflected form. Since axes are defined in the same ways as words, it is sensible to inflect a duplicated axis by adding a plus-word for "also" to its definition. This has a very similar effect to method number one, but is more readable and consistent.
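
A rough Python sketch of this second method. It assumes, as a simplification, that an inflected entry can be modelled as a root name plus a tuple of plus-words; the `Inflected` type and the functions `inflect` and `sum_with_also` are invented for illustration.

```python
from typing import NamedTuple

class Inflected(NamedTuple):
    root: str                # name of the dictionary (root-form) entry
    plus_words: tuple = ()   # minimal word-structures applied to the root

def inflect(item, plus_word):
    """Return a temporary entry: the same root with one more plus-word."""
    return Inflected(item.root, item.plus_words + (plus_word,))

def sum_with_also(words):
    """Sum words (axis -> value); a duplicated axis is inflected with "also"."""
    total = {}
    for word in words:
        for axis, value in word.items():
            while axis in total:
                axis = inflect(axis, "also")
            total[axis] = value
    return total

walked = inflect(Inflected("walk"), "past")  # an ordinary inflected word
size = Inflected("size")                     # an axis, defined like a word
summed = sum_with_also([{size: 0.9}, {size: 0.1}])
for axis, value in summed.items():
    print(axis.root, axis.plus_words, value)
```

The same `inflect` call serves both the verb form and the duplicated axis, which is the consistency the paragraph above points to.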






(subfile 15.25: a limit on summing functions)

It is important to remember that "definitions" change according to context. Carried to its fullest extent, allowing definitions to change in this way makes a word equivalent to a frame, and the summing of frames is much more complicated and more likely to involve the difficulties of incompatible and duplicated axes. Thus summing is a function that works better when it takes local, context-delimited definitions as arguments, rather than full definitions.

Elsewhere I describe the expansion of a word into an object that is similar to the classical AI concept of a frame: pointer-operators can exist within the levels of a word's definition, and if such a pointer exists without its associated value, it is equivalent to a "frame terminal". Such terminals are also easy to associate with default values, lists of exceptions, etc. (See subfile 29 {third example}, subfile 49, and "Parts-of-thought and classical A.I. frames", p.20, main file.)




(subfile 16: phrases in all three environments)

In European, Indian, and Arabian monophonic music, there exist small clusters of events that recur as units. In classical North Indian music, each Raga includes, as part of its definition, numerous such entities, and, very similarly, each mode in Gregorian Chant has its own characteristic set of short phrases. Likewise, the control of a robot arm entails many collections of primitive operations – collections that are learned once and then are never again used in their decomposed form. Whole sequences of navigational commands were learned by ROBOT (p.42, main file) in single trials, and were then available forever as single "words".





(subfile 16.5: two methods for calculating phrase boundaries)

I am not referring to complete, formal grammatical entities such as “noun phrase” or “predicate phrase” but rather to those repeated word sequences that the program might usefully concatenate into single units of meaning. Consider the following sentence fragments, each of which is likely to recur in the training of a program like this. In each case the phrase I wish to address comes first, and a possible completion of the sentence, not relevant right now, follows in parentheses.

Do you mean (“genes” or “jeans”)

I don’t know (George.)

When we were talking (before, you said......)

In each of these cases, a running sum of the axes presented by the words would encounter no repetitions of axes until the part of the sentence in parentheses. The three words “do you mean” share no axes of definition, but the arriving noun (“genes”) does share some with the pronoun “you”. Thus as soon as “genes” arrives, the program can sense an articulation in the sentence structure. It amounts to an exclusive-or operation on axis ID numbers - not exactly rocket science. This sort of calculation provides ways of suggesting points of division within a sentence. There exists a way to test these functions before run-time (see the fifth mode of learning, p.33, main file).
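
The first method can be sketched in a few lines of Python. The axis ID numbers assigned to each word are invented, and the sketch uses a set intersection to detect a repeated ID, which is just one convenient way to compare ID numbers and is not necessarily how the program does it.

```python
def first_articulation(words, definitions):
    """Return the index of the first word that repeats an axis already
    present in the running sum, or None if no axis is ever repeated."""
    seen = set()
    for index, word in enumerate(words):
        axes = definitions[word]
        if seen & axes:      # some axis ID has appeared before
            return index
        seen |= axes         # add this word's axes to the running sum
    return None

# Hypothetical axis IDs; "genes" and "you" are made to share axis 3.
definitions = {
    "do": {1},
    "you": {2, 3},
    "mean": {4},
    "genes": {3, 5},
}
words = ["do", "you", "mean", "genes"]
print(first_articulation(words, definitions))  # 3
```

The returned index marks the word at which a division is suggested, here the noun that follows “do you mean”.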


The second method is less connected to word definitions, and is a simple example of the use of a minimal running cluster in this program. Part of the enhanced Puss module developed in the 1980s includes a field for the storage of the number of times a datum gets stored. An extremely simple, low-level Puss whose window includes the most recent two words of input would exhibit a useful behavior that could trigger more sophisticated phrase-detection methods. Consider the sentence "I was reading to the little girl." The 'number of times a datum gets stored' would be

    = small after the window "reading to"
    = very large after the words "to the"
    = very small again after "the little".

These two large changes would occur frequently at phrase boundaries, and it is trivially simple to set up a program module that can graphically display this sort of quantity as a "real" conversation is scanned. Such a module allows the program designer to evaluate quickly the utility of specific clustering functions.
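
A small Python sketch of such a two-word window, with a crude text display of the stored counts. It is not the Puss module itself: the training sentences are invented, and `store_windows` and `scan` are hypothetical names for the two steps.

```python
from collections import Counter

def store_windows(corpus):
    """Count how many times each two-word window gets stored."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.lower().split()
        counts.update(zip(words, words[1:]))
    return counts

def scan(sentence, counts):
    """Return (window, times stored) for each two-word window in the sentence."""
    words = sentence.lower().split()
    return [(window, counts[window]) for window in zip(words, words[1:])]

# Invented training input, for illustration only.
corpus = [
    "I went to the store",
    "she walked to the park",
    "give it to the dog",
    "he was reading a book",
]
counts = store_windows(corpus)
profile = scan("I was reading to the little girl", counts)
for window, times in profile:
    # One '#' per time the window has been stored.
    print(f"{' '.join(window):<14} {'#' * times}")
```

With this toy input the count for "to the" stands well above its neighbours, which is the kind of jump the text suggests using as a trigger for more sophisticated phrase detection.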