This file contains subfile # 14



(subfile 14: the utility of collapse & some mechanics of concatenation)

It is better if the idea “to go home” becomes a unit expressed as a single word-object, rather than always existing as three separate ones. If the words remained separate, all of them would have to be parsed each time the phrase is seen. (Such clumping is known to be one of the ways “expert” thinking differs from a novice's.) On the other hand, once a conversation has reached a certain point of completion and it has been decided “to go home”, the definition of “home”, previously irrelevant, might become crucial, requiring that the 3-word entity once again be separated into its constituents.
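A minimal sketch of this collapse-and-separate cycle, in Python. The names `Collapsed`, `collapse`, and `expand` are hypothetical and chosen only for illustration; the text does not specify how a word-object is represented.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Collapsed:
    """A multi-word phrase treated as one word-object (hypothetical name)."""
    parts: Tuple[str, ...]

    def __str__(self) -> str:
        return "_".join(self.parts)


Token = Union[str, Collapsed]


def collapse(tokens: List[Token], phrase: Tuple[str, ...]) -> List[Token]:
    """Replace every occurrence of `phrase` with a single Collapsed unit."""
    out: List[Token] = []
    i, n = 0, len(phrase)
    while i < len(tokens):
        if n > 0 and tuple(tokens[i:i + n]) == phrase:
            out.append(Collapsed(phrase))
            i += n
        else:
            out.append(tokens[i])
            i += 1
    return out


def expand(tokens: List[Token]) -> List[Token]:
    """Separate collapsed units back into their constituent words."""
    out: List[Token] = []
    for tok in tokens:
        if isinstance(tok, Collapsed):
            out.extend(tok.parts)
        else:
            out.append(tok)
    return out


words = "we decided to go home".split()
unit_form = collapse(words, ("to", "go", "home"))  # three tokens instead of five
restored = expand(unit_form)                       # the original five words again
```

The collapsed unit keeps its parts, so separating it later (when “home” needs its own definition) loses nothing.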

Many of the word-series that are useful to collapse into single points would be called "phrases" in the usual grammar-lingo. As it examines input, the pre-processing parser receives some signals that a collection of words is a phrase: for instance, the presence of an “extra” verb in a sentence or the appearance of a preposition. By using both repetition of the collection itself (in association with a number of other words, all of the same part of speech) and clusters (p.18), the algorithm has a variety of ways to determine the boundaries of phrases.
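The repetition signal can be sketched roughly as follows, in Python. This is an illustration under stated assumptions, not the algorithm itself: the toy `POS` table, the function name `repeated_phrases`, and the thresholds are all hypothetical, and clusters (p.18) are not modeled here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Toy part-of-speech table; an assumption made only for this example.
POS = {"a": "article", "an": "article", "the": "article"}


def repeated_phrases(sentences: List[str], n: int = 3,
                     min_count: int = 2) -> Dict[Tuple[str, ...], str]:
    """Find n-word collections that recur before varied words of one part of speech."""
    followers: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for sentence in sentences:
        words = sentence.lower().split()
        # Record the word that follows each n-word window.
        for i in range(len(words) - n):
            followers[tuple(words[i:i + n])].append(words[i + n])

    found: Dict[Tuple[str, ...], str] = {}
    for gram, nxt in followers.items():
        tags = {POS.get(w) for w in nxt}
        repeated = len(nxt) >= min_count      # the collection itself recurs
        varied = len(set(nxt)) >= 2           # ...with a number of other words
        same_pos = len(tags) == 1 and None not in tags  # ...all of one part of speech
        if repeated and varied and same_pos:
            found[gram] = next(iter(tags))
    return found


sample = [
    "I would like a coffee",
    "I would like the menu",
    "I would like an apple",
]
candidates = repeated_phrases(sample)
```

In this sample only “i would like” qualifies: it recurs three times, each time followed by a different article.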

The collection “I would like”, for example, appears over and over, followed by an article and a noun. The repeated phrase should be treated as one entity whose definition includes the analysis performed earlier by the program. (The repetition is itself an excellent indicator of phrase-hood.) People do this all the time in ordinary speech, and sometimes there’s hardly any need to decompose the phrase ever again. We even make jokes about this; jests of the form:

            say a silly sound,          then establish a context that explains what the sound means

            “Jeet jet?”                 “Did you eat yet?”

            “Juwannago?”                “Would you like to go?”

It is noteworthy that the use of clusters to delineate phrases operates as the sentence is spoken, and not through some analysis performed after the whole sentence is available, such as traditional "diagramming".
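The word-by-word character of this process can be sketched as follows, in Python. Clusters are defined elsewhere (p.18) and are not modeled here; as a stand-in assumption, a set of already-learned phrases marks the boundaries. The function name `stream_collapse` is hypothetical.

```python
from typing import Iterable, Iterator, Set, Tuple


def stream_collapse(words: Iterable[str],
                    phrases: Set[Tuple[str, ...]]) -> Iterator[str]:
    """Emit words or collapsed phrases as each word arrives, never waiting
    for the end of the sentence. A phrase is emitted as soon as it is
    complete, so a shorter phrase wins over a longer one that starts the same way.
    """
    # Every leading fragment of a known phrase, e.g. ("to",), ("to", "go"), ...
    prefixes = {p[:k] for p in phrases for k in range(1, len(p) + 1)}
    pending: Tuple[str, ...] = ()
    for word in words:
        pending += (word,)
        # Release leading words that can no longer begin a known phrase.
        while pending and pending not in prefixes:
            yield pending[0]
            pending = pending[1:]
        if pending in phrases:
            yield "_".join(pending)
            pending = ()
    # Whatever is still held back at the end was not a complete phrase.
    yield from pending


known = {("to", "go", "home")}
spoken = list(stream_collapse("we want to go home now".split(), known))
```

Only the words that might still begin a known phrase are held back; everything else is passed along immediately, which is the contrast with after-the-fact diagramming.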