Physics Derivation Graph: September 2015

Monday, September 28, 2015

jupyter installation

Installing Jupyter ( https://jupyter.readthedocs.org/en/latest/install.html ) and ipyparallel ( https://github.com/ipython/ipyparallel ):

sudo pip install jupyter

sudo pip install ipyparallel

jupyter notebook --generate-config

vi ~/.jupyter/jupyter_notebook_config.py

c.NotebookApp.server_extensions.append('ipyparallel.nbextension')

Sunday, September 27, 2015

Suppose I'm given a list of words, {dog, cat, tv, boing}. When the user enters a new word, I want to let them know that there are exact or similar matches which already exist.

This will happen in a sequence of phases, from simplest to most complex:

phase 1: exact match the beginning of the word.
If I type "d", I am provided "dog"
If I type "o", I have no prompts
If I type "k", I have no prompts

phase 2: exact match anywhere in the word
If I type "d", I am provided "dog"
If I type "o", I am provided "dog" and "boing"
If I type "k", I have no prompts
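Phases 1 and 2 can be sketched in a few lines of Python. This is only a minimal sketch: the names WORDS, prefix_matches and substring_matches are hypothetical and chosen for illustration.

```python
# Word list taken from the example above.
WORDS = ["dog", "cat", "tv", "boing"]


def prefix_matches(query, words=WORDS):
    """Phase 1: words that start with the typed text."""
    return [w for w in words if w.startswith(query)]


def substring_matches(query, words=WORDS):
    """Phase 2: words that contain the typed text anywhere."""
    return [w for w in words if query in w]


# Show what each phase offers for the three inputs discussed above.
for typed in ["d", "o", "k"]:
    print(typed, prefix_matches(typed), substring_matches(typed))
```

Both functions return an empty list when nothing matches, which corresponds to "no prompts".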

phase 3: similar match anywhere in the word. Return the exact matches plus the top X% of similar matches, ranked by similarity
If I type "d", I am provided "dog"

If I type "o", I am provided "dog" and "boing"

If I type "og", I am provided "dog" (exact) and "boing" (similar)

If I type "k", I have no prompts
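One possible sketch of phase 3 in Python follows. The similarity measure is an assumption on my part: difflib.SequenceMatcher from the standard library stands in for whatever measure is eventually chosen. The names suggest and top_fraction (the "X%") are hypothetical.

```python
import math
from difflib import SequenceMatcher

WORDS = ["dog", "cat", "tv", "boing"]


def suggest(query, words=WORDS, top_fraction=0.5):
    """Return (exact, similar) suggestions for the typed text.

    exact: words containing the query as a substring.
    similar: the top fraction of the remaining words, ranked by
    SequenceMatcher ratio; words with zero similarity are dropped.
    """
    exact = [w for w in words if query in w]
    scored = []
    for w in words:
        if w in exact:
            continue
        score = SequenceMatcher(None, query, w).ratio()
        if score > 0:
            scored.append((score, w))
    # Highest similarity first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    keep = math.ceil(len(scored) * top_fraction)
    similar = [w for _, w in scored[:keep]]
    return exact, similar


for typed in ["d", "o", "og", "k"]:
    print(typed, suggest(typed))
```

With this measure, "og" gives "dog" as an exact match and "boing" as a similar match, while "k" gives nothing. A different similarity measure could rank candidates differently.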

http://stackoverflow.com/questions/7821661/how-to-code-autocompletion-in-python
https://pymotw.com/2/readline/

I'm not looking for tab completion. Instead, as I type, I want the top n matches to refresh automatically.

Saturday, September 19, 2015

Neo4j on Ubuntu installation notes

fresh install of Ubuntu 14.04 desktop amd64

sudo apt-get install default-jre

http://py2neo.org/2.0/

sudo pip install py2neo

property graph representation for expressions

There are many ways to represent an expression, e.g. a*x^2+b*x+c=d
Representing algebraic expressions is a good place to start, but for the Physics Derivation Graph I care about covering the full scope of Physics - calculus, derivatives, linear algebra, tensors, Dirac notation, Einstein notation, etc. See
https://github.com/allofphysicsgraph/proofofconcept/issues/7

Mathematica has a "tree form" for expressions which yields an Abstract Syntax Tree. I think the AST is incomplete, and would be more accurate as a property tree.

As an example, suppose we want to expand
TreeForm[ (a*x^2)+(b*x)+c==d ]

The nodes are a,b,c,d,x,2,=,+,^,*
The (directed) edges are

x-->power
2-->power
a-->times
power-->times
times-->plus
b-->times
x-->times
times-->plus
c-->plus
plus-->equal
d-->equal

However, this graph representation is incomplete. First, the nodes are not all of the same type:
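CREATE ( x:variable )
CREATE ( a:constant )
CREATE ( b:constant )
CREATE ( c:constant )
CREATE ( d:constant )
CREATE ( two:integer { value:2 } )
CREATE ( equal:relation { symbol:"=" } )
CREATE ( power:operator { name:"power" } )
CREATE ( plus:operator { name:"plus" } )
CREATE ( times:operator { name:"times" } )

Then edges are created, similar to above. Cypher requires a relationship type, so a placeholder type INPUT_TO is used here:
CREATE ( x )-[:INPUT_TO]->( power )
CREATE ( two )-[:INPUT_TO]->( power )
CREATE ( a )-[:INPUT_TO]->( times )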
...

However, the list of edges above has collisions -- there are two instances of "times-->plus", but they belong to two different multiplications (a*x^2 and b*x) and are not supposed to be the same edge. Thus, the edges require unique IDs to deal with collisions.
Similarly, the nodes in the graph feature collisions. The expression refers to a single "x", but the TreeForm representation has multiple separate "x" nodes.
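The collision can be seen in a short Python sketch. The edge list is copied from above; the numeric node IDs are arbitrary and purely illustrative.

```python
# Edges from the list above, as (source, target) label pairs.
edges = [
    ("x", "power"), ("2", "power"), ("a", "times"), ("power", "times"),
    ("times", "plus"), ("b", "times"), ("x", "times"), ("times", "plus"),
    ("c", "plus"), ("plus", "equal"), ("d", "equal"),
]

# Identified by label alone, the two "times-->plus" edges collapse into one.
print(len(edges), len(set(edges)))

# Giving each node its own ID keeps the two products apart, while the
# single variable x (ID 1) is shared by both terms.
nodes = {
    1: "x", 2: "2", 3: "power", 4: "a", 5: "times",  # a * x^2
    6: "b", 7: "times",                              # b * x
    8: "plus", 9: "c", 10: "equal", 11: "d",
}
edges_by_id = [
    (1, 3), (2, 3), (4, 5), (3, 5), (5, 8),
    (6, 7), (1, 7), (7, 8), (9, 8), (8, 10), (11, 10),
]

# No edge is lost once the endpoints are IDs rather than labels.
print(len(edges_by_id), len(set(edges_by_id)))
```

Because x is one node with two outgoing edges, the result is a graph rather than a tree.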

CREATE ( :variable { id:5938, symbol:"x" } )
CREATE ( :constant { id:5782, symbol:"a" } )
CREATE ( :constant { id:4525, symbol:"b" } )
CREATE ( :constant { symbol:"c" } )
CREATE ( :constant { symbol:"d" } )
CREATE ( :integer { symbol:"2" } )
CREATE ( :relation { symbol:"=" } )
CREATE ( :operator { name:"power", symbol:"^" } )
CREATE ( :operator { name:"plus", symbol:"+" } )

converting the Physics Derivation Graph from CSV to a property graph

I started the Physics Derivation Graph as a set of plain-text CSV files. I then converted the content to XML files. The current instance of the Physics Derivation Graph is back to a set of CSVs. CSV is attractive because it is universally parsable and human readable.

Recently I learned about property graphs. Expanding on the node/edge idea of graphs, a property graph adds types and properties:

Nodes have

node id
node type
node property (key-value pair)

Edges have

edge id
edge type
edge property (key-value pair)

Using syntax from Neo4j for a simple example,

CREATE ( e149832:Expression { id:149832, latex:"k=m j" } )

CREATE ( e119831:Expression { id:119831, latex:"k/j=m" } )

CREATE ( e149832 )-[:DIVIDEBOTHSIDESBY { feed_1:"j" } ]->( e119831 )

Each expression is a node, and inference rules are directed edges.
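The same structure can be sketched without a database. Below is a minimal Python sketch of a property graph, assuming plain dataclasses are enough for illustration. The class and field names are hypothetical, and the edge ID is made up because the Cypher example does not assign one.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    node_type: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    edge_id: int
    edge_type: str
    source: int  # node_id of the start node
    target: int  # node_id of the end node
    properties: dict = field(default_factory=dict)


nodes = [
    Node(149832, "Expression", {"latex": "k=m j"}),
    Node(119831, "Expression", {"latex": "k/j=m"}),
]
# Edge ID 1 is an arbitrary placeholder.
edges = [Edge(1, "DIVIDEBOTHSIDESBY", 149832, 119831, {"feed_1": "j"})]

# Index nodes by ID so each edge can be resolved to its endpoints.
by_id = {n.node_id: n for n in nodes}
for e in edges:
    print(by_id[e.source].properties["latex"],
          "--", e.edge_type, e.properties, "->",
          by_id[e.target].properties["latex"])
```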

Complications in translating:

Within each derivation, each expression has a local ID for LaTeX labels.
Each expression belongs to one or more derivations.
Each expression has a different representation in each CAS.
LaTeX uses a double quote ("), so parsing may break. Reference pictures instead?

CREATE ( e149832:Expression { id:149832, picture:"149832.png", sympy:"k=m*j", local_tex_id:"59589", in_derivations:["maxwell","funky"] } )

CREATE ( e119831:Expression { id:119831, picture:"119831.png", sympy:"k/j=m", local_tex_id:"58584", in_derivations:["maxwell"] } )

CREATE ( e149832 )-[:DIVIDEBOTHSIDESBY { feed_1_picture:"4958.png" } ]->( e119831 )
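An alternative to referencing pictures is to escape the LaTeX before it is written into a Cypher string. Cypher string literals accept backslash escapes, so both double quotes and LaTeX backslashes can be escaped. The sketch below is one way to do that in Python. The helper names and the node layout (an id property plus a latex property) are hypothetical. Passing the values as query parameters would avoid the escaping altogether.

```python
def cypher_string(text):
    """Return text as a double-quoted Cypher string literal.

    Backslashes are escaped first so that the backslashes added in
    front of quotes are not escaped a second time.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def create_expression(expr_id, latex):
    """Build a CREATE statement for one expression node (illustrative schema)."""
    return "CREATE ( :Expression { id:%d, latex:%s } )" % (
        expr_id, cypher_string(latex))


# A LaTeX string that contains both backslashes and double quotes.
print(create_expression(149832, r'\vec{E} = "k" m j'))
```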

Although a property graph can hold the same information as the current CSV format, it is no less complex than the CSV approach. Also, as far as I know, only Neo4j supports this syntax. Thus, I'm not currently motivated to switch to a property graph representation.

http://www.remwebdevelopment.com/blog/sql/some-basic-and-useful-cypher-queries-for-neo4j-201.html

http://peterspangler.com/?p=147

Sunday, September 13, 2015

linear storytelling constrains how one thinks

Textbooks are linear. Knowledge is not.

Describing content in graph form makes storytelling more complicated, but is a closer representation of knowledge.

major fields for the PDG to cover

Update 20190728: this content has been copied to this page

Step 1: identify major fields in Physics

electromagnetism
relativity
astrophysics
quantum mechanics
classical mechanics
thermodynamics

Step 2: identify top derivations associated with each area

EM: Maxwell's equations
Relativity: Lorentz (time dilation, length contraction)
Quantum: Schrödinger equation, uncertainty principle
Classical: F=ma, conservation of energy and momentum

Step 3:

switching between different Computer Algebra Systems

Although the PDG could be built in a single CAS (e.g. Mathematica), it won't translate to other CASs without associated Gödel indices.

Symbols can be composed of other symbols. Example: $E$ and $\vec{E}$, $\partial$ and $\frac{\partial}{\partial t}$

Symbols can be operators, e.g. $\frac{\partial}{\partial t}$

Operators, e.g. $+$, can act on symbols

derivations versus identities

A derivation in physics is distinct from a mathematical identity

The PDG is hierarchical

The Physics Derivation Graph is hierarchical, from the smallest unit to the largest:

Symbol, operators
Expressions, inference rules
Step
Derivation
PDG

This hierarchy can probably be used for visualization and for determining the storage format.

The PDG is a directed hypergraph.

Streamlining the current PDG process

[output] = inference_rule(input)

[output_1, output_2] = inference_rule(input_1, input_2, feed)

select an inference rule from a list
the user is prompted for the correct number of inputs, outputs, and feeds
inputs can be selected from all previous expressions
the database is populated with the required information
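The workflow above can be sketched in Python. Each step is a hyperedge that connects several inputs and feeds to several outputs through one inference rule. This is a sketch under assumptions: the InferenceRule class, the RULES table, the make_step function and the input/feed/output counts are hypothetical. A set stands in for the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceRule:
    name: str
    n_inputs: int
    n_feeds: int
    n_outputs: int


# Illustrative rule table with a single rule; the counts are assumed.
RULES = {
    "DIVIDEBOTHSIDESBY": InferenceRule("DIVIDEBOTHSIDESBY", 1, 1, 1),
}


def make_step(rule_name, inputs, feeds, outputs, known_expressions):
    """Validate one derivation step and record its outputs."""
    rule = RULES[rule_name]
    counts = (len(inputs), len(feeds), len(outputs))
    expected = (rule.n_inputs, rule.n_feeds, rule.n_outputs)
    if counts != expected:
        raise ValueError("%s expects (inputs, feeds, outputs) = %s, got %s"
                         % (rule_name, expected, counts))
    # Inputs may only be chosen from expressions that already exist.
    missing = [i for i in inputs if i not in known_expressions]
    if missing:
        raise ValueError("unknown input expressions: %s" % missing)
    # Outputs become available as inputs to later steps.
    known_expressions.update(outputs)
    return {"rule": rule_name, "inputs": list(inputs),
            "feeds": list(feeds), "outputs": list(outputs)}


known = {"k=m j"}
step = make_step("DIVIDEBOTHSIDESBY", ["k=m j"], ["j"], ["k/j=m"], known)
print(step)
```

Checking the counts against the rule before storing anything is what lets the interface prompt for exactly the right number of inputs, feeds and outputs.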