Our solution to the performance problem posed by the huge amount of data
the system must handle is to build a distributed system that parallelizes
the inferences. Several Prolog programs will run on different machines, each
making inferences over a different set of data with a different set of
rules. Each node will pass its inferences along to other nodes that may find
them useful. For example, if one node is making inferences about airplanes
as weapons and another node infers that someone is a potential plane
hijacker, then that inference should be passed to the first node. Inferences
received from other nodes will then be added as rules. New inferences will
work their way up the tree of nodes, where Prolog programs make inferences
based on facts that were obtained as a result of inferences at a lower
level.
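The upward flow of inferences described above can be sketched in Python.
This is only a minimal illustration, not the planned implementation: the
Node class, the rule functions, and the fact names (took_flight_lessons,
one_way_ticket, airplane_weapon_risk) are hypothetical, and plain Python
functions stand in for the Prolog programs that would do the real inference.

```python
class Node:
    """One machine in the tree; `rule` stands in for its Prolog program."""

    def __init__(self, name, rule, parent=None):
        self.name = name
        self.rule = rule        # hypothetical: maps a set of facts to inferred facts
        self.parent = parent    # next level up the tree, or None at the root
        self.facts = set()

    def receive(self, fact):
        """Store a fact, run one round of inference, pass new results upward."""
        if fact in self.facts:
            return
        self.facts.add(fact)
        new = self.rule(self.facts) - self.facts
        self.facts |= new
        if self.parent is not None:
            for inferred in sorted(new):
                self.parent.receive(inferred)


def people_rule(facts):
    # Hypothetical low-level rule: flight lessons plus a one-way ticket.
    return {
        ("potential_hijacker", who)
        for (pred, who) in facts
        if pred == "took_flight_lessons" and ("one_way_ticket", who) in facts
    }


def airplane_rule(facts):
    # Hypothetical higher-level rule about airplanes as weapons.
    return {
        ("airplane_weapon_risk", who)
        for (pred, who) in facts
        if pred == "potential_hijacker"
    }


root = Node("airplanes", airplane_rule)
leaf = Node("people", people_rule, parent=root)
leaf.receive(("took_flight_lessons", "bob"))
leaf.receive(("one_way_ticket", "bob"))
```

After the second call the lower node has inferred a potential hijacker and
the root has built its own inference on top of that fact. Only inferred
facts travel upward; the raw facts stay on the node that holds them.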
Our initial plan is to use Python to handle communications between nodes.
The choice to use Python was made because Dr. Wheeler has previous
experience using a Python-Prolog combination. In his experience, Python’s
ease of network programming and Prolog’s ease of inference make the
combination a good candidate for the implementation of our system.
The Python programs would be responsible for communication between nodes and
would run Prolog programs at each node. This seems like a good solution, as
Prolog is well suited to the inference engine but not to the other aspects
of the system.
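One way a Python program could run the Prolog program on its node is to
start the interpreter as a subprocess. The sketch below assumes SWI-Prolog
and its swipl command, which is an assumption since no Prolog interpreter is
named here; build_command and run_prolog are hypothetical helper names, and
the Prolog program is assumed to print its results on standard output.

```python
import shutil
import subprocess


def build_command(program_path, goal):
    """Build the command line for one non-interactive run (assumes swipl)."""
    # -q: quiet startup; -g: goal to run; -t halt: use halt as the top
    # level so the process exits instead of starting an interactive prompt.
    return ["swipl", "-q", "-g", goal, "-t", "halt", program_path]


def run_prolog(program_path, goal, timeout=30):
    """Run the goal and return the printed lines, or None if swipl is absent."""
    if shutil.which("swipl") is None:
        return None
    result = subprocess.run(
        build_command(program_path, goal),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
    return result.stdout.splitlines()
```

Starting a new process for every query is simple but slow; a long-running
Prolog process per node would avoid the startup cost at the price of more
complicated glue code.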
Communication between nodes would be achieved through message passing,
which is the natural interaction mechanism of a Beowulf cluster.
Communication should be kept on a ‘need to know’ basis: only useful
inferences should be sent to other nodes. A large distributed system that is
processing a vast amount of information can overload its communications
system, so message volume has to be given consideration.
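As a rough illustration of message passing between two nodes, the following
Python sketch sends one inference as a line of JSON over a local socket
pair. The JSON layout and the function names send_inference and
receive_inference are assumptions made for the example; on the cluster the
nodes might instead use a message-passing library such as MPI.

```python
import json
import socket


def send_inference(sock, predicate, args):
    """Send one inference as a single newline-terminated JSON message."""
    message = json.dumps({"predicate": predicate, "args": args}) + "\n"
    sock.sendall(message.encode("utf-8"))


def receive_inference(reader):
    """Read one message; return None when the sender has closed the link."""
    line = reader.readline()
    if not line:
        return None
    return json.loads(line)


# A connected pair of sockets stands in for two nodes on the network.
sender, receiver = socket.socketpair()
send_inference(sender, "potential_hijacker", ["bob"])
sender.close()

with receiver.makefile("r", encoding="utf-8") as reader:
    first = receive_inference(reader)
    second = receive_inference(reader)
receiver.close()
```

Here first holds the decoded inference and second is None, because the
sender closed its end after one message. One message per line keeps the
framing simple and makes it easy to count messages when watching for
overload.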
One interesting aspect of the project is that each node should be able to
tell whether an inference will be useful, and to which other nodes, rather
than simply sending every new inference to every other node.
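How a node decides which other nodes would find an inference useful is
still open. One simple possibility, sketched below in Python, is a table in
which each node registers the predicates it cares about. The Router class,
its methods, and the node and predicate names are all hypothetical.

```python
class Router:
    """Maps each predicate name to the nodes that asked to receive it."""

    def __init__(self):
        self.interests = {}   # predicate name -> set of node names

    def subscribe(self, node, predicate):
        """Record that `node` wants inferences about `predicate`."""
        self.interests.setdefault(predicate, set()).add(node)

    def recipients(self, sender, predicate):
        """Nodes that should receive this inference, excluding the sender."""
        return sorted(self.interests.get(predicate, set()) - {sender})


router = Router()
router.subscribe("airplane_node", "potential_hijacker")
router.subscribe("people_node", "potential_hijacker")

wanted = router.recipients("people_node", "potential_hijacker")
unwanted = router.recipients("people_node", "owns_boat")
```

With this table, an inference about potential_hijacker made on people_node
goes only to airplane_node, and an inference that no node has registered
for is not sent at all.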
Our initial goal is to use the UMaine Computer Science Beowulf cluster to
run the system. Once each node can run small Prolog programs and the nodes
can exchange inferences with one another, we will begin moving the existing
single-machine system to our parallel system, making modifications as
necessary.