(Subfile 39.3: Vibration, resonance, association, and parts-of-speech)
Consider the resonances, vibrational partners, bombs, and Purr-Puss
associations for nouns. These will all be quite different from the
resonances, bombs, etc., for verbs. Such consistent groupings of
qualities can be characterized numerically, summarized, reduced to
templates, and so on. Three types of object are easily calculated:
1) a word-type object that holds elements that are common to all nouns
2) one that holds elements that differ between nouns and verbs
3) one that expresses this difference functionally - that is, the
   transform between them.
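As a rough illustration of these three object types, here is a minimal Python sketch. It assumes that a word-type object can be modeled as a set of (axis, value) coordinates. The function names (common, differing, transform, apply_transform) and the sample coordinates are hypothetical, invented only for this example.

```python
from functools import reduce

def common(objects):
    """Type 1: coordinates shared by every object in a group."""
    return reduce(lambda a, b: a & b, objects)

def differing(a, b):
    """Type 2: coordinates found in one object but not the other."""
    return a ^ b

def transform(a, b):
    """Type 3: the difference expressed as something that can be applied."""
    return {"remove": a - b, "add": b - a}

def apply_transform(a, t):
    """Apply a transform produced by transform() to an object."""
    return (a - t["remove"]) | t["add"]

# Hypothetical coordinates, for illustration only.
truck = frozenset({("takes-article", "yes"), ("has-tense", "no"), ("carries", "cargo")})
cow = frozenset({("takes-article", "yes"), ("has-tense", "no"), ("eats", "grass")})
drive = frozenset({("takes-article", "no"), ("has-tense", "yes")})

noun_template = common([truck, cow])
noun_to_verb = transform(noun_template, drive)
print(sorted(noun_template))
print(sorted(differing(noun_template, drive)))
print(apply_transform(noun_template, noun_to_verb) == drive)
```

Set operations are used here only because they are the simplest possible stand-in for "common", "differing", and "transform"; richer word-type objects would need a richer comparison.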
Forming objects that consist of the sets of these characterizations is
also straightforward. For example, "truck" would have 1) Purr-Puss
associations with items that could be cargo, 2) resonances with other
means of transport, 3) argument-partners (see "partnership" in subfile
12.25) with objects that specify destination and route, and so
on. An object consisting of this set of characterizations would have
respective coordinate sets as diagrammed below.
There are three types of association (the left-most column), each
represented in the canonical MS fashion as a "coordinate", namely, a
particular axis and a value. Then there are the three bits of content
to which the associations point (the right-most column), again
expressed as coordinates.
    axis                      value     | axis     value | axis      value
1)  technique-of-association  purr-puss | pointer  next  | contents  cows
2)  technique-of-association  resonance | pointer  next  | vehicle   railroad
3)  technique-of-association  partner   | pointer  next  | route     TBA
Restated, in 1) a truck is associated via Purr-Puss prediction with its
contents, "cows". In 2) a truck is associated via resonance with
(another) vehicle, a railroad. In 3) truck is associated with its own
route because "truck" and "route" were both necessary (together) as
arguments to some other function.
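The three coordinate sequences for "truck" can be written down as a small data structure. This is a minimal sketch: the class names Coordinate and Association and the helper make_association are hypothetical, while the axes and values are the ones diagrammed above.

```python
from typing import NamedTuple

class Coordinate(NamedTuple):
    """A coordinate: a particular axis and a value on it."""
    axis: str
    value: str

class Association(NamedTuple):
    """One row of the diagram: three coordinates in sequence."""
    technique: Coordinate
    pointer: Coordinate
    content: Coordinate

def make_association(technique, content_axis, content_value):
    """Build one row; the pointer coordinate is the same in every row."""
    return Association(
        Coordinate("technique-of-association", technique),
        Coordinate("pointer", "next"),
        Coordinate(content_axis, content_value),
    )

truck = [
    make_association("purr-puss", "contents", "cows"),
    make_association("resonance", "vehicle", "railroad"),
    make_association("partner", "route", "TBA"),
]

for row in truck:
    print(" | ".join(f"{c.axis} {c.value}" for c in row))
```

Because each row is just a sequence of coordinates, it has the same shape as any other coordinate sequence and can be handled by the same routines.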
Each one of these three sequences looks exactly like a
part of a word – thus each can be transformed in all the ways words can
be: reduced to templates, etc.
The templates for these three objects are like spectra, in that the
activity of the words (the words' relations to other words) is what is
summarized, not the meaning or content or definition. It is the
function - the activity - of words that is involved with grammatical
parts-of-speech. "Drive" interacts differently depending on whether one
means 'cause a car to move' or 'what you take on Sunday afternoon',
even though the definitions contain many identical coordinates.
Frames were introduced by Marvin Minsky in the article
"A Framework for Representing Knowledge." A frame is a data structure
for parsing knowledge into substructures by representing "stereotyped
situations," and frames may be connected together to form a complete
idea. A frame contains information on its use, what might come next,
and what to do when these expectations are not satisfied. Some
information in the frame is generally unchanged while other
information, stored in "terminals," usually changes. (Different frames
may share the same terminals.) A frame's terminals may be filled
initially with default values, a design that, according to Minsky, is
based on how the human mind works. More about this program's interaction with
frames appears in subfile 11.25.
(Subfile 39.7: subset-types and vocabulary extraction)
Why subset-types?
Parts-of-speech are obviously important and useful, even if some of the
standard groups contain elements that are quite diverse.
Music theorists need to be able to discuss functional parts of melody
and harmony, and a number of subset types exist for this purpose, for
example
Ornamental and contrapuntal functions (passing tones, anticipations,
appoggiature),
Scale degree (final, tenor cadence, reciting tone, dominant), and
Roman numeral chord analysis.
In chant, raga, and various Near Eastern melodic styles, there exist
consistently used, small groups of notes; the particular allowed set of
such subsets is determined according to mode or raga. These are well
known to the performers and constitute a vocabulary of things to “say”,
just as a rock guitarist’s riffs provide a standard group of resources.
Navigation also requires that diverse behaviors be exhibited, and that
realm, like language, is one in which the same action can perform
different functions in different contexts. For instance, a left
rotation can be part of a search, an orientation toward a known goal,
an optimal escape route, etc. The movements of a robot arm have
functional subunits as well, that must be put together into useful
(“grammatically correct”) sequences. Many of these short series become
parts of a vocabulary, and may never need to be separated out, after
they have been incorporated into some standard action.
Extraction of a vocabulary
Before a discussion of a subset-type can occur, a vocabulary must be
defined; ‘parts-of-speech’ refer to ‘words’. To a computer, a text
consisting of words appears as a sea of undifferentiated and ungrouped
characters – there’s no a priori way for the program to know that “a”
is an element that only occurs as a part of vocabulary elements, and
that “ “ only occurs as a separator. After all, “a” is just 97,
and “ “ is just 32.
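A two-line Python check makes the point concrete. It assumes ASCII encoding, and the sample string is arbitrary.

```python
# To the machine, letters and the space are all just small integers.
text = "a cat"
codes = list(text.encode("ascii"))
print(codes)  # [97, 32, 99, 97, 116]
```

Nothing in the list of numbers marks 32 as special; that has to be discovered.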
How are we to extract useful sets of events from such an ocean? We
start with a logical assertion: if there are “events” in an “ocean”
then there must be some “things” separated from other “things”. The
only way to tell what “things” there are is by extended observation,
with hypotheses. We agree to start with the simplest hypotheses, thus,
we assert that there might be single character separators. For
instance, if the previous paragraph is considered, the first possible
separator is the second character, “e”. Such a separator yields a set
of possible vocabulary elements to test, as follows (hypothesized
elements here are enclosed within arrows, to make the existence of
spaces possible):
→B←
→for←
→ a discussion of a subs←
→t-typ←
→ can occur, a vocabulary must b←
These proposed elements are then simply sought out, throughout all of
the available text, and, whenever one is found (again) we increment its
score. We can predict that the proposed vocabulary element → a
discussion of a subs← will almost never recur, and will acquire a very
low score.
The “e” separator would receive a score equal to the sum of scores for
all the vocabulary elements it proposes. The next step is to proceed to
the next possible separator, namely, “f”, and score the subset-type it
would create. We know without testing that no separator will acquire a
score even close to that of “space” when examining English texts. Thus
a vocabulary can be extracted from a sea of events such as text,
because there is a single-character separator; such a thing is a
very obvious, simple, low-level thing to look for.
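The scoring procedure can be sketched in a few lines of Python. This is only one possible reading of the description, under stated assumptions: an element's score is the number of times it recurs among the pieces produced by the same separator (occurrences minus one), empty pieces are ignored, and a separator's score is the sum over its elements. The function names and the sample text are hypothetical.

```python
from collections import Counter

def score_separator(text, separator):
    """Sum, over all proposed elements, of how often each one recurs."""
    pieces = [p for p in text.split(separator) if p]  # drop empty pieces
    counts = Counter(pieces)
    # An element seen n times has been found "again" n - 1 times.
    return sum(n - 1 for n in counts.values())

def best_separator(text):
    """Try every character that occurs in the text as a separator."""
    candidates = sorted(set(text))
    return max(candidates, key=lambda sep: score_separator(text, sep))

sample = "the cat saw the dog and the dog saw the cat"
print(repr(best_separator(sample)))   # ' '
print(score_separator(sample, " "))   # 6
print(score_separator(sample, "e"))   # 0
```

On this tiny sample, splitting on "e" produces pieces such as " cat saw th" that never recur, while splitting on the space produces "the", "cat", "saw", and "dog", all of which do.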
If we imagine setting up a modern desktop computer to perform this
search on a sufficiently long text (maybe 10 pages?) it clearly will
discover that “space” is the best separator very, very quickly.
Imagine, then, a crawler that is expected to operate, say, overnight.
Quite a large number of separators could be tested, including all
two-character separators, as well as many in which the function of each
character changes according to some simple function.
Hofstadter’s problem
This problem can be discussed in terms of the paradigm described by
Hofstadter, in which a series of integers is presented, presumably
ordered in some way; the puzzle is to find the rules for generating the
series. For example, the following series
1.1.0.1.1.0.1.2.1.1.3.2.1.5.4.1.8.7.1.13.12.1.21.20.
must be found to be defined as: a series of triplets, all starting with
“1”, followed by a second number determined by the Fibonacci series,
followed by a third number equal to the second number minus one.
first element:    1  1  1  1  1  1
second element:   1  1  2  3  5  8
third element:    0  0  1  2  4  7
The relationships that we are asked to discover are
repeat: triplets
define triplet:
    first element is “1”
    second element is: start with two “1”s, and thereafter,
        second_element number ‘n’ = (second_element number ‘n-1’) +
        (second_element number ‘n-2’)
    third element = second element minus one
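The rules just listed are enough to regenerate the series. A minimal Python sketch follows; the names triplets and series are hypothetical.

```python
from itertools import islice

def triplets():
    """Yield (first, second, third) following the three rules above."""
    prev, curr = 1, 1  # the second elements start with two 1s
    while True:
        # first element is 1; third element is the second minus one
        yield (1, prev, prev - 1)
        prev, curr = curr, prev + curr  # Fibonacci step

def series(n_triplets):
    """Flatten the first n_triplets triplets into one list of integers."""
    return [x for t in islice(triplets(), n_triplets) for x in t]

print(".".join(str(x) for x in series(8)) + ".")
# 1.1.0.1.1.0.1.2.1.1.3.2.1.5.4.1.8.7.1.13.12.1.21.20.
```

The puzzle, of course, runs the other way: given only the printed series, recover these rules.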
For the purposes of this demonstration, assume that we are using a base
larger than the largest number present, so that the problem of 13
followed by 12 is avoided (that’s why I include the periods as
separators above).
A method for finding these divisions using “clusters” is described at
<>.
Here we allow ourselves to use whatever is available to a calculating
machine, but we require that we always start with the “simplest”
things. Thus if we say “you can use arithmetic operators” we start by
using only one at a time, or if any rule-type has a number in it (such
as “repeat ‘X’ five times”) we always start with the rule-form that has
the smallest number. This limitation on the search parameter arranges
it so that the simplest options will be tested first, and, if morning
arrives and the search must end, wherever we have arrived will always
be the same: “We tested everything we had time for, starting with the
simplest”.
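A simplest-first search with a cutoff can be sketched as follows. The assumptions here are loud ones: "simplest" is taken to mean "shortest separator", and the arrival of morning is modeled as a fixed budget of tests rather than a clock. All names are hypothetical.

```python
from itertools import product

def separators_simplest_first(alphabet, max_length):
    """Yield candidate separators, all shorter ones before any longer one."""
    for length in range(1, max_length + 1):
        for combo in product(alphabet, repeat=length):
            yield "".join(combo)

def search(candidates, score, budget):
    """Test candidates in the order given until the budget runs out."""
    best, best_score, tested = None, float("-inf"), 0
    for candidate in candidates:
        if tested >= budget:
            break  # "morning arrives": stop wherever we are
        s = score(candidate)
        tested += 1
        if s > best_score:
            best, best_score = candidate, s
    return best, best_score, tested

order = list(separators_simplest_first("ab", 2))
print(order)  # ['a', 'b', 'aa', 'ab', 'ba', 'bb']

# A stand-in scoring function: how often the candidate occurs in a text.
result = search(separators_simplest_first("ab", 2), "abab".count, budget=3)
print(result)  # ('a', 2, 3)
```

Whatever the budget, the candidates actually tested are always a prefix of the same simplest-first ordering, which is the property the paragraph above asks for.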
We assume that the elements in the series are “different”. We therefore
initially seek these differences. With integers, this means
subtraction; with complex word-type objects with large numbers of
coordinates, a more complicated “subtraction” must be defined (see
“difference clusters” below).
It is very important to remember, however, that the computer doesn’t
know, or care, how complex a function is involved. In the absence of
the rule “start with the simplest version of any process you propose”,
we could test for vocabularies based on any function whatever. Such
indifference (on the part of the method as a whole) is relevant
whenever there exists some aspect of the current context that suggests
starting with something other than the simplest possibilities. In the
case of the algorithms described in this paper, there is almost always
such information.
Having selected subtraction, we apply the “vocabulary defining” method
described above.
1) choose a pair of arguments (subtraction, in its simplest form, needs
   two) and choose them the simplest way first, namely, adjacent elements
      for this example, pairs would include: 1,1  1,2  2,1  3,2  1,5  etc.
2) perform your chosen function on the arguments, defining, one by one,
   a set of transforms between argument_pairs
      for these pairs, the transforms are (=) (+1) (-1) (-1) (+4)
3) label each transform, and increment the score associated with the
   label
4) repeat from 1)
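Steps 1) through 4) can be sketched in Python as below, assuming adjacent pairs and plain integer subtraction (right element minus left). The names transform_scores and label are hypothetical.

```python
from collections import Counter

def transform_scores(series):
    """Score each transform found between adjacent elements."""
    scores = Counter()
    for left, right in zip(series, series[1:]):  # step 1: adjacent pairs
        scores[right - left] += 1                # steps 2 and 3
    return scores

def label(t):
    """Write a transform the way the text does: '=', '+1', '-1', ..."""
    return "=" if t == 0 else f"{t:+d}"

text = "1.1.0.1.1.0.1.2.1.1.3.2.1.5.4.1.8.7.1.13.12.1.21.20."
series = [int(x) for x in text.strip(".").split(".")]
scores = transform_scores(series)

# Print a sideways histogram, one row per transform label.
for t in sorted(scores):
    print(f"{label(t):>4} {'|' * scores[t]}")
```

For the series as printed above, the label -1 accumulates by far the highest score.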
Using adjacent elements and subtraction, the transform-score histogram
for the series above would be
[Histogram: transform labels on the horizontal axis, running from -6
through = to +12; score on the vertical axis, running from 1 to 8.]
This shows that one of the relationships (from the second_element to
the third_element of every triplet) overwhelms the other transforms
rapidly. If we then write down the series with breakers at each
occurrence of this transform, the “tripletness” of the series is also
suggested (but not completely proven, as the sequence “ 3 2 1 5 4”
confuses the issue).
1.1.0.1.1.0.1.2.1.1.3.2.1.5.4.1.8.7.1.13.12.1.21.20.
Such calculations therefore suggest two of the essential things about
the initial series, including a likely separator.
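Placing the breakers can also be sketched in Python. The assumption here is that a breaker goes immediately after the second element of each pair related by the chosen transform; the function name break_at_transform is hypothetical.

```python
def break_at_transform(series, transform):
    """Split the series after every adjacent pair whose difference
    (right minus left) equals the given transform."""
    segments, current = [], []
    for i, x in enumerate(series):
        current.append(x)
        if i > 0 and x - series[i - 1] == transform:
            segments.append(current)  # place a breaker here
            current = []
    if current:
        segments.append(current)      # keep any unfinished tail
    return segments

text = "1.1.0.1.1.0.1.2.1.1.3.2.1.5.4.1.8.7.1.13.12.1.21.20."
series = [int(x) for x in text.strip(".").split(".")]
segments = break_at_transform(series, -1)
print(" | ".join(".".join(str(x) for x in seg) for seg in segments))
# 1.1.0 | 1.1.0 | 1.2.1 | 1.3.2 | 1 | 5.4 | 1.8.7 | 1.13.12 | 1.21.20
```

Most segments come out as triplets; the stray segments "1" and "5.4" come from the ambiguous stretch 3 2 1 5 4, where 2 followed by 1 is also a -1 transform.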
Having found a separator we have by definition found vocabulary
elements.