A few recent conversations with scientists about the Physics Derivation Graph have led me to think about different queries (247, 243, 241, 240, 239, 238) that could be of value and can be extracted from the current content.
In coming up with ways to query the graph, I realized a property graph is useful for supporting the queries. That is in contrast to writing a custom query capability against my existing JSON format. I'm already embarrassed by the JSON/SQL implementation, so having specific queries of interest provided me sufficient motivation to investigate implementing a Neo4j backend.
Transitioning to a property graph (specifically Neo4j) loses the fine-grained control over the mechanics of the graph. However, the trade-off is well worth the increased development speed. Having a Cypher query interface via the web GUI is very powerful.
Even with the property graph representation, some supplementary information is still needed in tabular format:
- all possible inference rules, along with the CAS implementation per rule
- all variable definitions, along with dimensions, whether each is a constant or a variable, scope, and reference URLs
- units, along with dimensions, and reference URLs
Those three tables could be stored in an SQL database.
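As a rough sketch of what those three tables might look like, here is a minimal SQLite example in Python. The table names, column names, and the sample row are my own assumptions for illustration, not the actual Physics Derivation Graph schema. In particular, a single `reference_url` column is a simplification of "reference URLs".

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the schema used by the Physics Derivation Graph.
SCHEMA = """
CREATE TABLE inference_rule (
    name               TEXT PRIMARY KEY,
    cas_implementation TEXT              -- e.g. source of the CAS check for this rule
);
CREATE TABLE symbol (
    name          TEXT PRIMARY KEY,
    dimensions    TEXT,
    is_constant   INTEGER NOT NULL CHECK (is_constant IN (0, 1)),
    scope         TEXT,
    reference_url TEXT
);
CREATE TABLE unit (
    name          TEXT PRIMARY KEY,
    dimensions    TEXT,
    reference_url TEXT
);
"""


def create_tables(conn):
    """Create the three supplementary tables on the given connection."""
    conn.executescript(SCHEMA)


# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
create_tables(conn)

# Insert one sample unit and read it back.
conn.execute(
    "INSERT INTO unit (name, dimensions, reference_url) VALUES (?, ?, ?)",
    ("meter", "length", "https://en.wikipedia.org/wiki/Metre"),
)
rows = conn.execute("SELECT name, dimensions FROM unit").fetchall()
print(rows)
```

The graph nodes in Neo4j would then only need to carry a key (for example, the symbol or inference rule name) that joins back to these tables.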
I'm replacing a single plaintext JSON file with two non-plaintext data formats -- SQL and Neo4j.