Saturday, July 8, 2017

anatomy of the "file per expression" data format in the Physics Derivation Graph

Each derivation gets a folder. The folder typically contains four CSV files:

  • expression_identifiers.csv: two columns of integers (7 digits, 10 digits) where each row looks like "1432042,1029039904"
  • feeds.csv: a single column of integers (7 digits) where each row looks like "2342425" 
  • inference_rule_identifiers.csv: two columns, an integer (7 digits) and an inference rule name, where each row looks like "1294844,declareInitialExpression"
  • derivation_edge_list.csv: two columns of integers (7 digits each) where each row looks like "3934948,3499522"
Elsewhere in this documentation I refer to the 7 digit integers as "temporary IDs" and the 10 digit integers as "permanent IDs".
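
As a quick illustration of the two ID lengths, here is a minimal Python sketch that tells a 7 digit temporary ID from a 10 digit permanent ID. The function name classify_id is hypothetical and not part of the Physics Derivation Graph code; it only encodes the digit counts described above.

```python
def classify_id(text):
    """Classify an ID string by its digit count (hypothetical helper)."""
    text = text.strip()
    if not (text.isascii() and text.isdigit()):
        return "unknown"
    if len(text) == 7:
        return "temporary"   # 7 digit integer
    if len(text) == 10:
        return "permanent"   # 10 digit integer
    return "unknown"


# Split a row of expression_identifiers.csv into its two IDs.
temp_id, permanent_id = "1432042,1029039904".split(",")
print(classify_id(temp_id), classify_id(permanent_id))
```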

The graph visualization is built from the content of derivation_edge_list.csv, but additional decorations are needed to render a meaningful picture. That's where the other three CSVs come into play -- they indicate what the relevant decoration is for each node (a feed, an expression, or an inference rule).
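
To make the relationship between the four files concrete, here is a rough Python sketch of how a derivation folder could be read. It is not the project's actual loader: the function names read_rows and load_derivation are made up for illustration, and I am assuming that the first column of each decoration file is the temporary ID that appears in the edge list.

```python
import csv
from pathlib import Path


def read_rows(path):
    """Return the non-empty rows of a CSV file; a missing file gives []."""
    path = Path(path)
    if not path.exists():
        return []
    with open(path, newline="") as handle:
        return [[cell.strip() for cell in row]
                for row in csv.reader(handle) if row]


def load_derivation(folder):
    """Return (edges, decorations) for one derivation folder.

    decorations maps a temporary ID to (kind, detail), where detail is
    the permanent ID for an expression, the rule name for an inference
    rule, and None for a feed. Assumption: temporary IDs are unique
    across the three decoration files.
    """
    folder = Path(folder)
    decorations = {}
    for temp_id, permanent_id in read_rows(folder / "expression_identifiers.csv"):
        decorations[temp_id] = ("expression", permanent_id)
    for (temp_id,) in read_rows(folder / "feeds.csv"):
        decorations[temp_id] = ("feed", None)
    for temp_id, rule_name in read_rows(folder / "inference_rule_identifiers.csv"):
        decorations[temp_id] = ("inference rule", rule_name)
    # Each edge is a pair of temporary IDs taken from the edge list.
    edges = [tuple(row) for row in read_rows(folder / "derivation_edge_list.csv")]
    return edges, decorations
```

Calling load_derivation on a derivation folder gives the list of edges plus a lookup table, so a drawing routine can ask for each node in an edge whether it is a feed, an expression, or an inference rule.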

