Learning with Space-Filling Curves
While writing a neural network program in the 1990’s, the thought occurred to me that space-filling curves (SFCs) might facilitate learning. As it happens, SFCs are not a measure of similarity but rather a foundation. For simplicity, while trying to work out the bugs and obtain an understanding how this might work, the only measure we have tried so far is nearest neighbor. The idea is that since some SFCs are one- to-one, on this side of infinity, the inputs of any output can be generated and the output space is well defined.
Here we use the Hilbert SFC because it is one-to-one short of infinity and associated with gray code. In fact, an n-dimensional Hilbert curve with binary coordinates is equivalent to gray code and the simplest model to understand what is happening with a space-filling curve.
The concept is to fill a space with a line such that each point on the line has a limit in that space. In simple terms it amounts to bending a line until the entire space is filled. Consider an n-space with binary dimensions. The inputs are gray coordinates (vGray) while the output is the Boolean length (vBool).
For binary coordinates, vGray has n! orderings each representing a different path along the “edge” of the n-space and diagonals are not allowed. In addition, nearest neighbor similarity between two sets of gray coordinates is equivalent to XOR of those coordinates counting the number of “1”s. This can be written as XOR(a, b) = similarity.
Since vBool has length 2^n it is necessary to use an iterative limit to define where one classification ends and another begins. Whereas support vectors place the boundaries in an n-space, SFCs place the boundaries in a single dimension.
For an n-space with binary coordinates, the origin vGrayOrigin is all zeros. Every vGrayNext with single “1” has the similarity of XOR(vGrayOrigin, vGrayNext) = 1; There are n possible vGrayNext, since n!/(1!(n-1)!) = n. If vGrayNext has two coordinates which are “1” there are n!/(2!(n-2)!) possible combinations. The first is iteration divides vBool into n intervals. The second iteration divides vBool into n!/(2!(n-2)!) intervals. The boundaries of the intervals of any given iteration have the same similarity.
If there are x iterations, (n!/((x-1)!(n-x-1)!) / (n!/x!(n-x)!) = x(n-x) is less than n. Thus, each iteration divides vBool into intervals and the next iteration over any one interval divides that interval into less than n subsequent intervals. Further, all the similarities between the boundaries are greater or less than the boundary similarity. This is a function which limits the XOR similarities and approaches any point in vBool in at most n iterations.
Another step is the determination of the possible classes between any two points in vBool. The two points may have coordinates in common starting from the left. Those are inserted for the left coordinates of each training example while also flipping the training coordinates as necessary to keep within the two point interval. After checking similarities, this gives all possible classes within that interval. This along with the iterative limits provides a one-to-one map from vGray to vBool such that vBool has classification intervals.
C++ code will be expanded and improve with time. It may be freely used, distributed and modified
at your discretion.
While there are no intentional errors, there are no implicit or explicit guarantees or assumptions this code has any value whatsoever.
Binary and Gray Code Example
Generate Binary Numbers
Generate Binary Numbers
Binary Number to Gray Code and vise versa
Binary Number as a length to Gray Code Coordinates of M dimensions and N Bits per dimension. This code needs revision or replacement as it is difficulty to follow.
Gray Code Measure of Similarity (XOR) This provides a one-to-one map between the inputs and outputs of the classification system.
Get Least Upper Bound Given Total Bits and the target Similarity This is for the limit function.
My office is:
Muskegon Surgical Associates
My email address may be constructed by using the first letter from my first
middle initial and the four letters of my last name "at" comcast.net
James J. Rice
Last modified 05/18/2018
Completion Date 01/01/2020