TITLE OF PAPER: PyChinko: A Native Python Rule Engine
URL OF PRESENTATION:
_URL_of_powerpoint_presentation_
PRESENTED BY: Yarden Katz
REPRESENTING:
_name_of_the_company_they_represent_
CONFERENCE: PyCon 2005
DATE: March 24, 2005
LOCATION: Grand Ballroom, GWU
--------------------------------------------------------------------------
REAL-TIME NOTES / ANNOTATIONS OF THE PAPER:
{If you've contributed, add your name, e-mail & URL at the bottom}
Goals: Semantic Web rule engin
e
Limited resource environment
Beat CWM
Some common applications for rule systems:
processing web content
automating software compilation
foundation for expert systems
Basic syntax of rules:
A left hand side (LHS) - conditions that must hold for the rule to file
A right hand side (RHS) - set of actions taken when the rule is fired.
Examples of rule syntaxes.
How to process rules efficiently:
Charles Forgy's Rete algorithm
Builds a discrimination network (decision tree) for channelling facts only to relevant rules
Trades memory for speed
Especially efficient when there are many different rules.
Detailed explanation of Rete algorithm
Digression on the Semantic Web
Recall orignal motivation: rule engine copatible with SW languages and more specifically, faster than a slow but popular Python rule engine for the SW called CWM
SW Goal: give useful semantics to otherwise meaningless (to a computer) documents: allow for intelligent software agents to do some things that would (now) require a human eye and brain
In short, bringing KR/AI techniques to the web (and leaving the hype at home!)
Giving a story about looking up images of flowers and even though the text search is good for documents it would not work for an image that has a flow but other text information and the SW would allow for meta information that could be mined by rule engines that would grok the meta data and find the image.
There is a lot of hype and what they are trying to do is bring rules and standards to the open source world to allow for quicker acceptance.
Digression (cont'd): Semantic Web & Rues
Core language is RDF (), a W3C standard; serializable into triples of the form (subject predicate object)
Knowledge representation language called OWL () layered on top of RDF
Example: FOAF is an OWL ontology. Large instance data (RDF documents) for FOAF generated by LiveJournal, among others.
Taste of SW languages
RDF describes ressources (anything w/URI, essentiallY)
Assertion: Noam Chomsky is the author of Manufacturing Consent, published 1988
One way to think of it: there exists something in the world,
<image of the book here>
<example of a RDF description of the assertions above>
or in N3 (which would you prefer writing?):
@prefix dc: <http://purl.org
<http://www.chomsky.info/ManuCons>
dc:creator "NOam Chomsky";
dc:title "Manufacturing Consent";
dc:date "1988"
<image of the graph that represents that>
CWM (closed-world machine) is a python rule engine for SW that uses naive forward chaining algorithm
Compared with our Rete(on 1200 facts KB w/two matcha ll rule):
2400 inferred facts (double original)
Pychinko time: 0m1.609s
CWM time: 0m28.691s
Increase to 70000 initial facts, our time: 1.729s CWM time: 14m58.707s(!)
This is one of Rete's worst cases
Applications of rules (one the web)
Social networking: Processing FOAF documents. foaf is a semantic web vocabulary for forming social networks. consider a web application performing social network analysis on large, spidered foaf documents. an intelligent filtering mechanism to avoid 'bland' documents (perhaps autogenerated by a different application) unfit for analysis is needed. a filtering mechanism of this kind might follow foaf:
Trust: policies on the web; permission/restrictions, I grant access to X only to senior employees - otheres can only view database Y and Z" or "Only known FOAF friends can view my pcitrue gallery" are natrually captured in the form of rules. The rules are easily expressed in Pychinko, while the terms to which the rules refer are not
Future work (in progress, actually)
Generalizing PYchinko (native Python syntax in addition to N3, N-ary paterns)
Moving Pychinko to the Nokia. On the phoen, resource are limited, and the hardwarelimitations prevent one from using heavy-duty reasoners such as those writeen in Java. Can use Pychinko to process rules on the phone.
The existing code is limited to at most a few thousand rules. For larger data sets there needs to be an extension to support secondary storage.
Questions
Q. so what do you do in the face of conflicting data
A. the rules are purely symbolic, we have no way of saying they conflict semantically. our rules systems are not able to handle true conflicts
Q. what does a rule look like in Pychinko
A. he pointed to one of the earlier slides showing that rules are queries
Q. what is the complexity of the engine
A. linearly
In case anyone is interested... since it looks like it didn't make any of the slides... PyChinko uses rdflib... see http://rdflib.net/
--------------------------------------------------------------------------
REFERENCES:
{as documents / sites are referenced add them below}
http://www.mindswap.org/~katz/pychinko/
--------------------------------------------------------------------------
QUOTES:
--------------------------------------------------------------------------
CONTRIBUTORS:
{add your name, e-mail address and URL below}
Ted Leung <twl@sauria.com> <http://www.sauria.com/blog>
--------------------------------------------------------------------------
E-MAIL BOUNCEBACK:
{add your e-mail address separated by commas }
--------------------------------------------------------------------------
NOTES ON / KEY TO THIS TEMPLATE:
A headline (like a field in a database) will be CAPITALISED
This differentiates from the text that follows
A variable that you can change will be surrounded by
_underscores_
Spaces in variables are also replaced with under_scores
This allows people to select the whole variable with a simple double-click
A tool-tip is lower case and surrounded by
{curly brackets / parentheses}
These supply helpful contextual information.
--------------------------------------------------------------------------
Copyright shared between all the participants unless otherwise stated...