Sunday, September 20, 2020

use the inputs and inference rule to generate the output

Instead of expecting the user to provide the inputs and outputs and inference rule, supplying the inputs and inference rule is sufficient to generate the output. This output is necessarily consistent with the inputs and inference rule.

>>> from sympy import *

Define an inference rule

def mult_both_sides_by(expr, feed):
    return Equality(expr.lhs*feed, expr.rhs*feed, evaluate=False)
 
>>> expr = parse_latex('a = b')
>>> feed = parse_latex('f')
>>> mult_both_sides_by(expr, feed)
Eq(a*f, b*f)

This generalizes to include the relation

def mult_both_sides_by(expr, feed, relation):
    return relation(expr.lhs*feed, expr.rhs*feed, evaluate=False)
 
>>> mult_both_sides_by(expr, feed, Equality)
Eq(a*f, b*f)

Other relations are available; see https://docs.sympy.org/latest/modules/core.html
>>> mult_both_sides_by(expr, feed, Le)
a*f <= b*f

text to Latex to SymPy using frequency and period example

As an illustration of the gradations from text to Latex to CAS is provided below. In the derivation the CAS is 1-to-1 with the Latex.



statement

Frequency and period are inversely related.


statement with mathematical notation

Frequency and period are inversely related; thus T = 1/f and f = 1/T

statement with mathematical notation and explanation of derivation

Frequency and period are inversely related; thus T = 1/f
Multiple both sides by f, then divide by T to get f = 1/T.
statement with explanation of derivation, separating expressions from text

Frequency and period are inversely related; thus 
T = 1/f.
Multiple both sides by f to get
f T=1
then divide by T to get
f = 1/T.

statement with expressions separated from text and with bindings between math and text made explicit

Frequency and period are inversely related; thus 
expression 1: T = 1/f
Multiple both sides of expression 1 by f to get expression 2
expression 2: f T=1
then divide both sides of expression 2 by T to get expression 3
expression 3: f = 1/T.

statement with inference rules made explicit

claim: Frequency and period are inversely related; thus
inference rule: declare initial expression
expression 1: T = 1/f
inference rule: Multiple both sides of expression 1 by f to get expression 2
expression 2: f T=1
then 
inference rule: divide both sides of expression 2 by T to get expression 3
expression 3: f = 1/T.
inference rule: declare final expression


use of a Computer algebra system to implement inference rules

The following expansion requires

  • conversion of Latex to SymPy
  • correctly implemented inference rules

>>> import sympy
>>> from sympy import *
>>> from sympy.parsing.latex import parse_latex

claim: Frequency and period are inversely related; thus
inference rule: declare initial expression
expression 1: T = 1/f

To confirm consistency of representations, the input Latex expression can be converted to SymPy and then back to Latex using

>>> latex(eval(sympy.srepr(parse_latex('T = 1/f'))))
'T = \\frac{1}{f}'

We'll work with the SymPy representation of expression 1,

>>> sympy.srepr(parse_latex('T = 1/f'))
"Equality(Symbol('T'), Pow(Symbol('f'), Integer(-1)))"

Rather than using the SymPy, use the raw format of expression 1

>>> expr1 = parse_latex('T = 1/f')

inference rule: Multiple both sides of expression 1 by f to get expression 2
expression 2: f T=1

Although we can multiply a variable and an expression,

>>> expr1*Symbol('f')
f*(Eq(T, 1/f))

what actually needs to happen is first split the expression, then apply the multiplication to both sides

>>> Equality(expr1.lhs*Symbol('f'), expr1.rhs*Symbol('f'))
Eq(T*f, 1)

Application of an inference rule (above) results in the desired result, so save that result as the second expression (below).

>>> expr2 = Equality(expr1.lhs*Symbol('f'), expr1.rhs*Symbol('f'))

inference rule: divide both sides of expression 2 by T to get expression 3
expression 3: f = 1/T.

>>> Equality(expr2.lhs/Symbol('T'), expr2.rhs/Symbol('T'))
Eq(f, 1/T)

Again, save that to a variable

>>> expr3 = Equality(expr2.lhs/Symbol('T'), expr2.rhs/Symbol('T'))

>>> latex(expr3)
'f = \\frac{1}{T}'

inference rule: declare final expression


statement with inference rules and numeric IDs for symbols

To relate the above derivation to any other content in the Physics Derivation Graph, replace T and f with numeric IDs unique to "period" and "frequency"

>>> import sympy
>>> from sympy import *
>>> from sympy.parsing.latex import parse_latex

claim: Frequency and period are inversely related; thus
inference rule: declare initial expression
expression 1T = 1/f

>>> expr1 = parse_latex('T = 1/f')
>>> eval(srepr(expr1).replace('T','pdg9491').replace('f','pdg4201'))
Eq(pdg9491, 1/pdg4201)

Save the result as expression 1
>>> expr1 = eval(srepr(expr1).replace('T','pdg9491').replace('f','pdg4201'))

inference rule: Multiple both sides of expression 1 by f to get expression 2
expression 2f T=1

>>> feed = Symbol('f')
>>> feed = eval(srepr(feed).replace('f','pdg4201'))
>>> Equality(expr1.lhs*feed, expr1.rhs*feed)
>>> Equality(expr1.lhs*feed, expr1.rhs*feed)
Eq(pdg4201*pdg9491, 1)
>>> expr2 = Equality(expr1.lhs*feed, expr1.rhs*feed)

inference rule: divide both sides of expression 2 by T to get expression 3
expression 3f = 1/T.

>>> feed = Symbol('T')
>>> feed = eval(srepr(feed).replace('T','pdg9491'))
>>> Equality(expr2.lhs/feed, expr2.rhs/feed)
Eq(pdg4201, 1/pdg9491)
>>> expr3 = Equality(expr2.lhs/feed, expr2.rhs/feed)

Convert from numeric ID back to Latex symbols in Latex expression
>>> latex(eval(srepr(expr3).replace('pdg9491','T').replace('pdg4201','f')))
'f = \\frac{1}{T}'

inference rule: declare final expression

removal of text, pure Python

The above steps can be expressed as a Python script with two functions (one for each inference rule)

from sympy import *
from sympy.parsing.latex import parse_latex

# assumptions: the inference rules are correct, the conversion of symbols-to-IDs is correct, the Latex-to-SymPy parsing is correct

def mult_both_sides_by(expr, feed):
    return Equality(expr.lhs*feed, expr.rhs*feed)

def divide_both_sides_by(expr, feed):
    return Equality(expr.lhs/feed, expr.rhs/feed)

# inference rule: declare initial expression
expr1 = parse_latex('T = 1/f')
expr1 = eval(srepr(expr1).replace('T','pdg9491').replace('f','pdg4201'))

feed = Symbol('f')
feed = eval(srepr(feed).replace('f','pdg4201'))
expr2 = mult_both_sides_by(expr1, feed)

feed = Symbol('T')
feed = eval(srepr(feed).replace('T','pdg9491'))
expr3 = divide_both_sides_by(expr2, feed)

latex(eval(
srepr(expr3).replace('pdg9491','T').replace('pdg4201','f')))
# inference rule: declare final expression


How would the rigor of the above be increased?

To get beyond what a CAS can verify, a "proof" would relate each of the two functions to a set of axioms. Given the two arguments (an expression, a "feed" value), is the returned value always consistent with some set of axioms?

The set of axioms chosen matters. For example, we could start with Zermelo–Fraenkel set theory

That would leave a significant gap between building up addition and subtraction and getting to calculus and differential equations. "Theorems of calculus derive from the axioms of the real, rational, integer, and natural number systems, as well as set theory." (source)


statements I believe to be true

Here is what I currently understand to be true:

  • Every Physics Derivation is comprised of discrete steps.
  • Each Step in a Physics Derivation has a single Inference rule. 
  • Every mathematical expression in a Physics Derivation can be written in Latex. 
  • There are some math expressions which cannot be written in a given CAS
    • example: definite evaluation after integration, like (x+y)|_{a}^{b} in SymPy
  • There are some derivation steps that cannot be expressed in a given CAS
  • There is not a 1-to-1 correspondence between a CAS-based graph and a Latex-based graph. 
  • There are many gradations between "text" and Latex and CAS and proof. 

The consequence of the CAS-based graph not having 1-to1 correspondence with a Latex-based graph is that the current data structure of Latex and SymPy in one graph is not suitable. 


problem identification for vectors and my current responses

I encountered a few issues with SymPy today. I'll explain what I found and what I plan to do about it.

I have an expression (1158485859) that, in Latex, is

\frac{-\hbar^2}{2m} \nabla^2 = {\cal H}

The \nabla^2 is represented in SymPy as Laplacian(), though an argument is required for the operator.

My solution: leave the SymPy representation as Pow(Symbol('nabla'), Integer(2))


Also on the topic of vectors, I encountered the expression

\vec{p} \cdot \vec{F}(\vec{r}, t) = a

In order to convert that to SymPy, I'd need to specify a basis for `\vec{p}` (e.g., `from sympy.vector import CoordSys3D`). For example, 

>>> N = CoordSys3D('N')
>>> p = Symbol('p_x')*N.i + Symbol('p_y')*N.j + Symbol('p_z')*N.k
>>> F = Symbol('F_x')*N.i + Symbol('F_y')*N.j + Symbol('F_z')*N.k
>>> p.dot(F)
F_x*p_x + F_y*p_y + F_z*p_z

However, that doesn't seem to extend to functions:

>>> p.dot(Function(F)(Symbol('r'), Symbol('t')))
Traceback (most recent call last):
...
TypeError: expecting string or Symbol for name

My solution: leave the SymPy representation as incorrect, using "multiplication" instead of "dot"

vectors in SymPy and use of dot cross and the Laplacian

Converting "\vec{\psi}(r, t)" into SymPy is feasible
Function('vecpsi')(Symbol('r'), Symbol('t'))
but I can't figure out how to apply the dot product with a vector:

>>> import sympy
>>> from sympy import *
>>> from sympy.vector import CoordSys3D, Del, curl, divergence, gradient
>>> Symbol('vecp').dot( Function('vecpsi')(Symbol('r'), Symbol('t')) )
Traceback (most recent call last):
AttributeError: 'Symbol' object has no attribute 'dot'

The issue is that a vector needs to be specified in a specific dimension (e.g., 3) and have specific coefficients with respect to the basis.

>>> N = CoordSys3D('N')
>>> v1 = Symbol('a')*N.i+Symbol('b')*N.j + Symbol('c')*N.k
>>> v2 = Symbol('f')*N.i+Symbol('g')*N.j + Symbol('k')*N.k
>>> v1.dot(v2)
a*f + b*g + c*k
>>> v1.cross(v2)
(b*k - c*g)*N.i + (-a*k + c*f)*N.j + (a*g - b*f)*N.k

see https://en.wikipedia.org/wiki/Del
>>> delop = Del()
>>> delop(Symbol('a'))
0
>>> delop(v1)
(Derivative(a*N.i + b*N.j + c*N.k, N.x))*N.i + (Derivative(a*N.i + b*N.j + c*N.k, N.y))*N.j + (Derivative(a*N.i + b*N.j + c*N.k, N.z))*N.k
>>> v1
a*N.i + b*N.j + c*N.k
>>> curl(v1)
0
>>> divergence(v1)
0
>>> Laplacian(v1)
Laplacian(a*N.i + b*N.j + c*N.k)

Also, operators can't be defined since using Laplacian requires an argument:
>>> Laplacian()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __new__() missing 1 required positional argument: 'expr'

Sunday, September 6, 2020

expanding the list of Computer Algebra Systems used by the Physics Derivation Graph

The current implementation of the Physics Derivation Graph relies on user to 1) enter expressions as Latex and 2) assist/review the conversion from Latex to SymPy.

The choice of Latex as the input interface is based on ease of use (conciseness, low barrier to entry) and could be replaced by other options for specifying expressions. Entering characters and then parsing into an AST could be applied to a variety of input methods. A second choice is what to use the AST for -- currently the output is Sympy, but a variety of output representations is available (Sage, Mathematica, Maxima, etc).

In my experience with Sage, expressions in Sage are not as compact. There seems to be less support for features like units (feet, meters, seconds) -- https://ask.sagemath.org/question/49511/working-with-units/

I'm confident there are expressions in the PDG that Sage supports and SymPy does not. Expanding the CAS used by PDG means having three representations -- Latex and SymPy and Sage. The negative consequences of this expansion would include
  • increases the work needed to input the different representations, and
  • increases the work of verifying consistency between representations, and 
  • imposes extra knowledge requirements on the developer, and
  • additional dependencies
The potential benefit would be coverage for validating some steps not currently possible with SymPy. There might be value to other people who want to compare SymPy to Sage. 

summary of SymPy hurdles for the Physics Derivation Graph



Friday, September 4, 2020

Latex symbols that are operators - how to create SymPy placeholder

{\cal H} in Latex is used for the Hamiltonian in the "1D particle in a box"
https://en.wikipedia.org/wiki/Hamiltonian_(quantum_mechanics)
Needs to be translated to a SymPy operator 
https://docs.sympy.org/latest/modules/physics/quantum/piab.html

The Laplace operator \nabla^2 in Latex is 
delop = Del()
delop.dot(delop())

representing the Laplace operator (nabla in Latex) in SymPy using Del

https://en.wikipedia.org/wiki/Laplace_operator
https://docs.sympy.org/latest/modules/vector/fields.html

>>> import sympy
>>> from sympy import *
>>> from sympy.parsing.latex import parse_latex
>>> from sympy.vector import Del


Original expression as Latex converted to SymPy to Latex:
>>> latex(eval(sympy.srepr(parse_latex("\\frac{-\\hbar^2}{2m} \\nabla = {\\calH}"))))
'- \\frac{\\hbar^{2} nabla}{2 m} = calH'

The first two conversions yield SymPy:
>>> sympy.srepr(parse_latex("\\frac{-\\hbar^2}{2m} \\nabla = {\\calH}"))
"Equality(Mul(Symbol('nabla'), Mul(Mul(Integer(-1), Pow(Symbol('hbar'), Integer(2))), Pow(Mul(Integer(2), Symbol('m')), Integer(-1)))), Symbol('calH'))"

This can be successfully evaluated as SymPy because the 'nabla' is a Symbol
>>> eval("Equality(Mul(Symbol('nabla'), Mul(Mul(Integer(-1), Pow(Symbol('hbar'), Integer(2))), Pow(Mul(Integer(2), Symbol('m')), Integer(-1)))), Symbol('calH'))")
Eq(-hbar**2*nabla/(2*m), calH)

However, replacing 'nabla' with 'Del' causes the eval to fail:
>>> eval("Equality(Mul(Del, Mul(Mul(Integer(-1), Pow(Symbol('hbar'), Integer(2))), Pow(Mul(Integer(2), Symbol('m')), Integer(-1)))), Symbol('calH'))")
Traceback (most recent call last):
...
  File "/usr/local/lib/python3.6/dist-packages/sympy/core/mul.py", line 307, in flatten
    b, e = o.as_base_exp()
AttributeError: type object 'Del' has no attribute 'as_base_exp'



Original expression to convert to SymPy:
>>> sympy.srepr(parse_latex("\\nabla^2 \\psi \\left( \\vec{r},t) \\right) = \\frac{i}{\\hbar} \\vec{p} \\cdot \\left( \\vec{ \\nabla} \\psi( \\vec{r},t) \\right)"))
"Mul(Pow(Symbol('nabla'), Integer(2)), Mul(Symbol('psi'), Mul(Symbol('right'), Function('left')(Mul(Symbol('r'), Symbol('vec')), Symbol('t')))))"


The Latex "\nabla" is SymPy's "Del". However, squaring Del isn't available
>>> sympy.latex(sympy.Pow(Del, sympy.Integer(2)), sympy.Symbol('x'))
...
AttributeError: type object 'Del' has no attribute '_eval_power'


The Laplacian operator is the dot product of two Del operators, so
>>> delop = Del()
>>> sympy.latex(delop.dot(delop(sympy.Function('\psi')(sympy.Symbol('r'), sympy.Symbol('t')))))
'0'


Evaluating definite integrals for humans versus SymPy breaks the Latex-to-SymPy mapping of steps

 In the process of fixing expressions and steps in the PDG database, I encountered a novel challenge.

Currently every step in the PDG has a set of Latex expressions. These expressions are provided by the user, converted to SymPy, and then the step is validated using SymPy. There is a one-to-one mapping of "what the user sees" to "what the CAS checks."

In a derivation there is a sequence like

    f = \int_a^b x dx

inference rule: carry out definite integration

    f = (x^2/2)|_a^b

inference rule: simplify

    f = 1/2 (b^2 - a^2)

It seems that SymPy doesn't support the middle expression and instead goes directly from the first to the last expression. That means there is an intermediary Latex expression that cannot be converted to Sympy, breaking the assumption of "what the human reads is one-to-one with what the CAS checks."

Accounting for the mismatch between Latex steps and CAS steps would make the already-messy graph structure more complicated. Some steps that are included for human readability would not be able to be specified to the CAS, nor would those steps be checked.

ufw allow status numbered delete -- managing the firewall

$ sudo ufw status verbose
Status: active
Logging: on (low)
Default: deny (incoming), allow (outgoing), deny (routed)
New profiles: skip

To                         Action      From
--                         ------      ----
22/tcp                     LIMIT IN    Anywhere                  
443/tcp                    ALLOW IN    Anywhere                  
80/tcp                     ALLOW IN    Anywhere                  
25                         ALLOW IN    Anywhere                  
22/tcp (v6)                LIMIT IN    Anywhere (v6)             
443/tcp (v6)               ALLOW IN    Anywhere (v6)             
80/tcp (v6)                ALLOW IN    Anywhere (v6)             
25 (v6)                    ALLOW IN    Anywhere (v6)             


$ sudo ufw allow 8000/tcp
Rule added
Rule added (v6)


$ sudo  ufw status numbered
Status: active

     To                         Action      From
     --                         ------      ----
[ 1] 22/tcp                     LIMIT IN    Anywhere     
[ 2] 443/tcp                    ALLOW IN    Anywhere  
[ 3] 80/tcp                     ALLOW IN    Anywhere  
[ 4] 22/tcp (v6)                LIMIT IN    Anywhere (v6) 
[ 5] 443/tcp (v6)               ALLOW IN    Anywhere (v6) 
[ 6] 80/tcp (v6)                ALLOW IN    Anywhere (v6) 
[ 7] 25 (v6)                    ALLOW IN    Anywhere (v6) 

$ sudo ufw delete 7
Deleting:
 allow 25
Proceed with operation (y|n)? y

Rule deleted (v6)