I've used Neo4j for other tasks associated with knowledge representation, so I'm surprised I haven't considered property graphs for storing the PDG. (There's no mention in my old notes or issues, and nothing meaningful besides a generic link on the wiki.)
One potential benefit of using a property graph over a plain graph is that edges can be labeled. Currently, when an inference rule has multiple input expressions or feeds, it's not clear which input is which. For example, consider "IntOverFromTo", which has the LaTeX expansion "Integrate Eq.~\ref{eq:#4} over $#1$ from lower limit $#2$ to upper limit $#3$." There are three feeds. Without labeling which feed is which, the substitution is ambiguous.
With a property graph, the inference rule would have predefined labeled edges, i.e., "lower_limit", "upper_limit", and "integrate_wrt".
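To make the idea concrete, here is a minimal Python sketch of how labeled edges would remove the ambiguity. It is not how the PDG code works today and it doesn't touch Neo4j; the edge list is just an in-memory stand-in for a property graph. The node values ("x", "0", "b", "1234"), the "input_expression" label, and the mapping from labels to template slots are all assumptions I made up for illustration.

```python
# The LaTeX expansion for the inference rule, as quoted above.
TEMPLATE = r"Integrate Eq.~\ref{eq:#4} over $#1$ from lower limit $#2$ to upper limit $#3$."

# Assumed mapping from edge label to template slot (hypothetical).
SLOT_FOR_LABEL = {
    "integrate_wrt": "#1",
    "lower_limit": "#2",
    "upper_limit": "#3",
    "input_expression": "#4",  # hypothetical label for the input expression
}

# Each edge is (source node, edge label, inference rule node).
# The order of the edges deliberately does not match the slot order.
edges = [
    ("b", "upper_limit", "IntOverFromTo"),
    ("x", "integrate_wrt", "IntOverFromTo"),
    ("0", "lower_limit", "IntOverFromTo"),
    ("1234", "input_expression", "IntOverFromTo"),
]


def expand(template, edges, rule):
    """Fill each template slot using the label on the incoming edge."""
    text = template
    for source, label, target in edges:
        if target == rule:
            text = text.replace(SLOT_FOR_LABEL[label], source)
    return text


expanded = expand(TEMPLATE, edges, "IntOverFromTo")
print(expanded)
```

Because the label on each edge decides the slot, the result does not depend on the order in which the feeds happen to be stored.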
Benefits of using a property graph include:
- visualization tools are more likely to already exist, so I wouldn't have to code up a d3js-based web display.
- querying and editing the graph uses a standard syntax, rather than relying on me creating a Python-based CLI with pre-set abilities.
- the current data structure is a list of dictionaries in memory and a set of CSV files in directories; using Neo4j, I wouldn't need to manage the data structure and could still translate back to plain text.
- adding properties (e.g., LaTeX versus SymPy for expressions, comments, weblinks) would be more scalable than the current data structure and schema, which are manually crafted.
- cross-platform compatibility is not lost.
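As a rough sketch of what the migration might look like, the Python below turns a list of dictionaries into Cypher `CREATE` statements as plain text. It doesn't connect to a database. The "Expression" node label, the property names, and the sample records are hypothetical, not the actual PDG schema. The point is only that a record with extra properties (SymPy, a comment) needs no schema change.

```python
def cypher_literal(value):
    """Quote a value as a single-quoted Cypher string.

    Backslashes are doubled (LaTeX is full of them) and single quotes escaped.
    """
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def node_statement(label, props):
    """Build one CREATE statement for a node with the given properties."""
    body = ", ".join(f"{key}: {cypher_literal(val)}" for key, val in props.items())
    return f"CREATE (:{label} {{{body}}})"


# Hypothetical records in the "list of dictionaries" style.
expressions = [
    {"id": "1234", "latex": "a = b"},
    # A record with extra properties; nothing else has to change.
    {"id": "5678", "latex": r"\int f(x) dx", "sympy": "Integral(f(x), x)",
     "comment": "added later"},
]

statements = [node_statement("Expression", record) for record in expressions]
for statement in statements:
    print(statement)
```

The generated text could be pasted into a Cypher shell or kept alongside the CSV files as a plain-text export.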