Sunday, December 27, 2020

ordered list representation in RDF

The Physics Derivation Graph depends on a data structure capable of using ordered lists. RDF's support for ordered lists is slightly convoluted. The best visualization of ordered lists in RDF I've found is https://ontola.io/blog/ordered-data-in-rdf/

I tried sketching how the "linked recursive lists" approach looks for the Physics Derivation Graph for a derivation that has a sequence of steps, and each step has an ordered list of inputs, feeds, and outputs.



Credit: dreampuf.github.io
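
As a rough sketch of the "linked recursive lists" idea, the plain-Python snippet below encodes an ordered list as triples using the standard rdf:first / rdf:rest / rdf:nil vocabulary and then walks the links to recover the order. The `pdg:` identifiers, the blank-node labels, and the function names are made up for illustration; this is not the Physics Derivation Graph's actual schema.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST = RDF + "first"
RDF_REST = RDF + "rest"
RDF_NIL = RDF + "nil"


def to_rdf_list(items, prefix):
    """Return (head, triples) encoding items as a linked RDF list.

    Each list cell gets a hypothetical blank-node label built from prefix.
    """
    if not items:
        return RDF_NIL, []
    cells = [f"_:{prefix}{i}" for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        next_cell = cells[i + 1] if i + 1 < len(items) else RDF_NIL
        triples.append((cells[i], RDF_FIRST, item))      # the element itself
        triples.append((cells[i], RDF_REST, next_cell))  # link to the remainder
    return cells[0], triples


def from_rdf_list(head, triples):
    """Follow rdf:rest links from head, collecting rdf:first values in order."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    items = []
    node = head
    while node != RDF_NIL:
        items.append(first[node])
        node = rest[node]
    return items


# Hypothetical identifiers for the ordered inputs of one step
inputs = ["pdg:expr_1", "pdg:expr_2"]
head, triples = to_rdf_list(inputs, "in")
recovered = from_rdf_list(head, triples)
```

Each element costs two triples, and reading the n-th element means following n links, which is part of why the representation feels convoluted.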

Sunday, December 13, 2020

identifying classes in the Physics Derivation Graph for OWL (Web Ontology Language)

Classes and subclasses of entities in the Physics Derivation Graph:

  • derivations = an ordered set of two or more steps
  • steps = a set of one or more statements related by an inference rule
  • inference rule = identifies the relation of a set of one or more statements
  • statement = two or more expressions (LHS and RHS) and a relational operator
    • expressions = an ordered set of symbols
    • symbols = a token
      • operator = applies to one or more values (aka operands). Property: number of expected values
      • value. Property: categorized as "variable" xor "constant"
        • integer = one or more digits. The set of digits depends on the base
        • float
        • complex
      • unit. Examples: "m" for meter, "kg" for kilogram
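
One way to make the hierarchy above concrete is a sketch with Python dataclasses. The class and field names are my assumptions, the integer/float/complex subclasses are omitted for brevity, and the inference rule name in the example is a placeholder.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Symbol:
    token: str


@dataclass
class Operator(Symbol):
    arity: int  # number of expected values (operands)


@dataclass
class Value(Symbol):
    kind: str  # "variable" xor "constant"

    def __post_init__(self):
        if self.kind not in ("variable", "constant"):
            raise ValueError("kind must be 'variable' or 'constant'")


@dataclass
class Unit(Symbol):
    pass


@dataclass
class Expression:
    symbols: List[Symbol]  # order matters


@dataclass
class Statement:
    lhs: Expression
    relation: str  # relational operator such as "=" or "<"
    rhs: Expression


@dataclass
class Step:
    inference_rule: str
    statements: List[Statement]

    def __post_init__(self):
        if len(self.statements) < 1:
            raise ValueError("a step needs at least one statement")


@dataclass
class Derivation:
    steps: List[Step]

    def __post_init__(self):
        if len(self.steps) < 2:
            raise ValueError("a derivation needs at least two steps")


# y = x^2 + b, written as flat ordered symbol lists
lhs = Expression([Value("y", "variable")])
rhs = Expression([Value("x", "variable"), Operator("^", 2), Value("2", "constant"),
                  Operator("+", 2), Value("b", "variable")])
stmt = Statement(lhs, "=", rhs)
step = Step("placeholder inference rule", [stmt])
```

The `__post_init__` checks enforce the "one or more" and "two or more" cardinalities from the list; in OWL those would be cardinality restrictions rather than runtime checks.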
Some aspects of expressions and derivations I don't have names for yet:
  • binary operators {"where", "for all", "when", "for"} used to relate two expressions: the "primary expression" on the left and one or more "scope"/"definition"/"constraint" expressions (equations or inequalities) on the right

Some aspects of expressions and derivations I don't need to label in the PDG:
  • terms = parts of the expression that are connected with addition and subtraction
  • factors = parts of the expression that are connected by multiplication
  • coefficients = numbers that multiply a variable in a mathematical expression
  • power, base, exponent
  • base (as in decimal vs hexadecimal, etc)
  • formula
  • function

An equation is two expressions linked with an equal sign. 
What is the superclass above "equation" and "inequality"?
So far I'm settling on "statement".

I am intentionally staying out of the realm of {proofs, theorems, axioms} both because that is outside the scope of the Physics Derivation Graph and because the topic is already addressed by OMDoc. 

Suppose we have a statement like
y = x^2 + b where x = {5, 3, 1}
In that statement, 
  • "y = x^2 + b" is an equation
  • "x^2 + b" is an expression and is related to the expression "y" by equality. 
  • "x^2" is a term in the RHS expression
  • "x = {5, 3, 1}" is an equation that provides scope for the primary equation. 
What is the "where" relation in the statement? The "where" is a binary operator that relates two equations. There are other "statement operators" to relate equations, like "for all"; see the statement
a + c = 2*g + k for all g \in \Re
In that statement, "g \in \Re" is (an equation?) serving as a scope for the primary equation. 

All statements have supplemental scope/definition equations that are usually left implicit. The reader is expected to deduce the scope of the statement from the surrounding context. 

The supplemental scope/definition equations describe both per-variable and inter-variable constraints. For example,
x*y + 3 = 94 where ((x \in \Re) AND (y \in \Re) AND (x<y))
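
A small sketch of how the primary equation and its scope could be kept as two separate predicates and combined with "where". The function names are made up, and `numbers.Real` stands in for membership in \Re.

```python
from numbers import Real


def primary(x, y):
    """The primary equation: x*y + 3 = 94."""
    return x * y + 3 == 94


def scope(x, y):
    """Per-variable constraints (both real) and an inter-variable one (x < y)."""
    return isinstance(x, Real) and isinstance(y, Real) and x < y


def holds(x, y):
    """'primary where scope': check the scope first, then the equation."""
    return scope(x, y) and primary(x, y)


ok = holds(7, 13)        # 7*13 + 3 = 94 and 7 < 13
swapped = holds(13, 7)   # equation is satisfied, but x < y fails
not_real = holds(1j, 2)  # rejected by the per-variable constraint
```

Checking the scope first matters here: the comparison `x < y` is only evaluated once both values are known to be real.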

More complicated statement:
f(x) = { 0 for x<0
       { 1 for 0<=x<=1
       { 0 for x>1
Here the LHS is a function and the RHS is an integer, but the value of the integer depends on x. 
Note that the "0<=x<=1" can be separated into "0<=x AND x<=1". Expanding this even more,
(f(x) = 0 for x<0) AND (f(x) = 1 for (0<=x AND x<=1)) AND (f(x) = 0 for x>1)
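
The expanded form maps naturally onto a list of (scope, value) pairs. The sketch below is one possible encoding, with names of my choosing; it also checks that exactly one scope applies to a given x.

```python
# Each entry is one "(f(x) = value for scope)" clause from the expanded form
BRANCHES = [
    (lambda x: x < 0, 0),
    (lambda x: 0 <= x and x <= 1, 1),
    (lambda x: x > 1, 0),
]


def f(x):
    """Evaluate the piecewise function by finding the single clause whose scope holds."""
    matches = [value for in_scope, value in BRANCHES if in_scope(x)]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one matching scope for x={x}")
    return matches[0]


values = [f(x) for x in (-1, 0, 0.5, 1, 2)]
```

If the scopes overlapped or left a gap, `f` would raise instead of silently picking a branch.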

Saturday, December 12, 2020

an argument in support of RDF instead of property graphs

I've wrestled with whether to use Property Graphs to store and query the Physics Derivation Graph. I see potential value, but the licensing of Neo4j keeps me from committing. I'm aware of other implementations, but I don't have confidence in either their stability or their durability.

This post makes a convincing argument about both the shortcomings of a property-graph-based knowledge graph and the value of an RDF-based storage method. To summarize,

  • don't be distracted by visualization capabilities; inference is more important
  • property graph IDs are local, whereas identifiers in RDF are global. 
  • Global IDs are vital for enabling federation, merge, diff
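
A toy illustration of why global identifiers help: if every node and edge label is a full IRI, two graphs can be treated as sets of triples, so merge is set union and diff is set difference. The `https://example.org/pdg/` namespace and the property names are hypothetical, and real RDF merging needs extra care with blank nodes, which this sketch ignores.

```python
BASE = "https://example.org/pdg/"  # hypothetical namespace

graph_a = {
    (BASE + "step1", BASE + "hasInput", BASE + "expr1"),
    (BASE + "step1", BASE + "hasOutput", BASE + "expr2"),
}
graph_b = {
    (BASE + "step1", BASE + "hasOutput", BASE + "expr2"),  # same fact as in graph_a
    (BASE + "step2", BASE + "hasInput", BASE + "expr2"),
}

merged = graph_a | graph_b     # merge: shared facts collapse automatically
only_in_b = graph_b - graph_a  # diff: what graph_b adds
```

With local integer IDs, "node 17" in one store and "node 17" in another would have no guaranteed relationship, so neither operation would be this direct.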

I know OWL (Web Ontology Language) is popular for knowledge representation, and this post was the first to provide a clear breakdown of the difference between property graphs, RDF, and OWL. OWL supports

  • the ability to infer that a node that is a member of a class is also a member of any of its superclasses
  • properties can have superproperties
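
The superclass inference can be sketched without any RDF tooling. The snippet below hard-codes the symbol hierarchy from the December 13 post as a child-to-parent map and walks it upward. It assumes single inheritance and uses function names of my choosing; an OWL reasoner does considerably more than this.

```python
# child class -> parent class, taken from the symbol hierarchy
SUBCLASS_OF = {
    "integer": "value",
    "float": "value",
    "complex": "value",
    "value": "symbol",
    "operator": "symbol",
    "unit": "symbol",
}


def superclasses(cls):
    """Return every ancestor of cls, nearest first."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def is_member(asserted_class, target_class):
    """A node asserted to be in asserted_class is inferred to be in each superclass too."""
    return target_class == asserted_class or target_class in superclasses(asserted_class)


ancestors = superclasses("integer")
```

A node typed only as "integer" is therefore also a "value" and a "symbol" without those memberships being stored explicitly.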
OWL overview:
  • https://www.cambridgesemantics.com/blog/semantic-university/learn-rdf/
  • https://www.cambridgesemantics.com/blog/semantic-university/learn-owl-rdfs/owl-101/
  • https://www.cambridgesemantics.com/blog/semantic-university/learn-owl-rdfs/