Monday, September 28, 2015

jupyter installation

Installation of Jupyter ( https://jupyter.readthedocs.org/en/latest/install.html ) and
ipyparallel ( https://github.com/ipython/ipyparallel )

sudo pip install jupyter

sudo pip install ipyparallel

jupyter notebook --generate-config
vi ~/.jupyter/jupyter_notebook_config.py
and add the line
c.NotebookApp.server_extensions.append('ipyparallel.nbextension')

Sunday, September 27, 2015

autocomplete and variations


Suppose I'm given a list of words, {dog, cat, tv, boing}. When the user enters a new word, I want to let them know that there are exact or similar matches which already exist.

This will happen in a sequence of phases, from simplest to most complex:

phase 1: exact match the beginning of the word.
If I type "d", I am provided "dog"
If I type "o", I have no prompts
If I type "k", I have no prompts

phase 2: exact match anywhere in the word
If I type "d", I am provided "dog"
If I type "o", I am provided "dog" and "boing"
If I type "k", I have no prompts

phase 3: similar match anywhere in the word. Return exact matches and top X% of similar, ranked by similarity
If I type "d", I am provided "dog"
If I type "o", I am provided "dog" and "boing"
If I type "og", I am provided "dog" (exact) and "boing" (similar)
If I type "k", I have no prompts
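
The three phases above can be sketched in Python. This is a minimal sketch, not a finished design: the function names, the sliding-window similarity measure built on difflib.SequenceMatcher, and the fixed cutoff of 0.4 (standing in for "top X%") are all my assumptions.

```python
from difflib import SequenceMatcher

WORDS = ["dog", "cat", "tv", "boing"]


def prefix_matches(query, words):
    """Phase 1: words that start with the query."""
    return [w for w in words if w.startswith(query)]


def substring_matches(query, words):
    """Phase 2: words that contain the query anywhere."""
    return [w for w in words if query in w]


def window_similarity(query, word):
    """Best SequenceMatcher ratio between the query and any
    same-length slice of the word (an assumed similarity measure)."""
    n = len(query)
    if n == 0:
        return 0.0
    if n >= len(word):
        return SequenceMatcher(None, query, word).ratio()
    return max(
        SequenceMatcher(None, query, word[i:i + n]).ratio()
        for i in range(len(word) - n + 1)
    )


def similar_matches(query, words, cutoff=0.4):
    """Phase 3: exact substring matches, plus the remaining words whose
    similarity reaches the (assumed) cutoff, most similar first."""
    exact = substring_matches(query, words)
    scored = [(window_similarity(query, w), w) for w in words if w not in exact]
    scored.sort(key=lambda pair: -pair[0])
    similar = [w for score, w in scored if score >= cutoff]
    return exact, similar


print(prefix_matches("d", WORDS))
print(substring_matches("o", WORDS))
print(similar_matches("og", WORDS))
```

The cutoff is the knob to tune: a percentile over the scored list would be closer to "top X%", but a fixed threshold is easier to reason about on a four-word list.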


http://stackoverflow.com/questions/7821661/how-to-code-autocompletion-in-python
https://pymotw.com/2/readline/

I'm not looking for tab completion. Instead, as I type, I want the top n matches to refresh automatically.
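
The refresh-as-you-type behavior can be simulated without a terminal UI by recomputing the matches after each keystroke. This is only a sketch: the function name is hypothetical, matching is plain substring matching, and ranking shorter words first is a placeholder assumption. A real version would hook the same function into a terminal or GUI event loop.

```python
def refresh_per_keystroke(typed, words, n=3):
    """Yield (partial_input, top_n_matches) after each simulated keystroke."""
    for end in range(1, len(typed) + 1):
        partial = typed[:end]
        matches = [w for w in words if partial in w]
        # Placeholder ranking (assumption): shorter words first.
        matches.sort(key=len)
        yield partial, matches[:n]


for partial, matches in refresh_per_keystroke("og", ["dog", "cat", "tv", "boing"]):
    print(repr(partial), matches)
```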


Saturday, September 19, 2015

Neo4j on Ubuntu installation notes

fresh install of Ubuntu 14.04 desktop amd64

sudo apt-get install default-jre


http://py2neo.org/2.0/

sudo pip install py2neo

property graph representation for expressions

There are many ways to represent an expression, e.g. a*x^2+b*x+c=d
Representing algebraic expressions is a good place to start, but for the Physics Derivation Graph I care about covering the full scope of Physics - calculus, derivatives, linear algebra, tensors, Dirac notation, Einstein notation, etc. See
https://github.com/allofphysicsgraph/proofofconcept/issues/7

Mathematica has a "tree form" for expressions which yields an Abstract Syntax Tree. I think the AST is incomplete, and would be more accurate as a property tree.

As an example, suppose we want to expand
TreeForm[ (a*x^2)+(b*x)+c==d ]

The nodes are a,b,c,d,x,2,=,+,^,*
The (directed) edges are

  • x-->power
  • 2-->power
  • a-->times
  • power-->times
  • times-->plus
  • b-->times
  • x-->times
  • times-->plus
  • c-->plus
  • plus-->equal
  • d-->equal

However, this graph representation is incomplete. First, the nodes are not all of the same type:
CREATE ( x:variable )
CREATE ( a:constant )
CREATE ( b:constant )
CREATE ( c:constant )
CREATE ( d:constant )
CREATE ( 2:integer )
CREATE ( =:relation )
CREATE ( ^:operator { name:"power" } )
CREATE ( +:operator { name:"plus" } )

Then edges are created, similar to above
CREATE ( x )-->( power )
CREATE ( 2 )-->( power )
CREATE ( a )-->( times )
...

However, the bulleted list of edges has collisions -- there are two instances of "times-->plus", but they are not supposed to be the same. Thus, the edges require unique IDs to deal with collisions.
Similarly, the nodes in the graph feature collisions. The expression refers to a single "x", but the TreeForm representation has multiple separate "x" nodes. Giving each node a unique ID and moving the symbol into a property addresses this:

CREATE ( 5938:variable { symbol:"x" } )
CREATE ( 5782:constant { symbol:"a" } )
CREATE ( 4525:constant { symbol:"b" } )
CREATE ( :constant { symbol:"c" } )
CREATE ( :constant { symbol:"d" } )
CREATE ( :integer { symbol:"2" } )
CREATE ( :relation { symbol:"=" } )
CREATE ( :operator { name:"power", symbol:"^" } )
CREATE ( :operator { name:"plus", symbol:"+" } )
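
The collision problem can be reproduced with Python's standard ast module standing in for Mathematica's TreeForm. This is a sketch under assumptions: the expression is rewritten in Python syntax (** for power, == for equality), the function name is hypothetical, and Python's Add is binary, so the tree has two Add nodes where TreeForm shows a single Plus. Each tree node gets its own integer ID, so the two "x" leaves and the two multiplications stay distinct.

```python
import ast
from itertools import count


def expression_edges(source):
    """Parse a Python-syntax expression; return ({node_id: label}, [(child_id, parent_id)])."""
    tree = ast.parse(source, mode="eval").body
    ids = count(1)
    nodes = {}
    edges = []

    def label(node):
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp):
            return type(node.op).__name__      # e.g. "Add", "Mult", "Pow"
        if isinstance(node, ast.Compare):
            return type(node.ops[0]).__name__  # single comparison assumed, e.g. "Eq"
        raise ValueError("unsupported node: " + type(node).__name__)

    def children(node):
        if isinstance(node, ast.BinOp):
            return [node.left, node.right]
        if isinstance(node, ast.Compare):
            return [node.left] + node.comparators
        return []

    def visit(node):
        node_id = next(ids)                    # unique ID per occurrence
        nodes[node_id] = label(node)
        for child in children(node):
            edges.append((visit(child), node_id))
        return node_id

    visit(tree)
    return nodes, edges


nodes, edges = expression_edges("a*x**2 + b*x + c == d")
for child_id, parent_id in edges:
    print(child_id, nodes[child_id], "-->", parent_id, nodes[parent_id])
```

Because edges are stored as pairs of IDs rather than pairs of labels, the two "Mult --> Add" edges no longer collide. Merging the two "x" nodes into one shared symbol would be a separate step that this sketch does not attempt.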

converting the Physics Derivation Graph from CSV to a property graph

I started the Physics Derivation Graph as a set of plain-text CSV files. I then converted the content to XML files. The current instance of the Physics Derivation Graph is back to a set of CSVs. CSV is attractive because it is universally parsable and human-readable.

Recently I learned about property graphs. Expanding on the node/edge idea of graphs, a property graph adds types and properties:

Nodes have
  • node id
  • node type
  • node property (key-value pair)
Edges have
  • edge id
  • edge type
  • edge property (key-value pair)
Using syntax from Neo4j for a simple example,

CREATE ( 149832:Expression { latex:"k=m j" } )
CREATE ( 119831:Expression { latex:"k/j=m" } )
CREATE ( 149832 )-[:DIVIDEBOTHSIDESBY { feed_1:"j" } ]->( 119831 )

Each expression is a node, and inference rules are directed edges. 
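
The same structure can be written down outside of Neo4j. Here is a minimal Python sketch of a property graph holding the two expressions above. The class and field names are my own, and the edge ID of 1 is made up, since the Neo4j example does not assign one.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    node_type: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    edge_id: int
    edge_type: str
    source_id: int
    target_id: int
    properties: dict = field(default_factory=dict)


nodes = {
    149832: Node(149832, "Expression", {"latex": "k=m j"}),
    119831: Node(119831, "Expression", {"latex": "k/j=m"}),
}
# Edge ID 1 is hypothetical; the inference rule is the edge type.
edges = [Edge(1, "DIVIDEBOTHSIDESBY", 149832, 119831, {"feed_1": "j"})]


def successors(node_id):
    """Return (edge_type, target_node) for every edge leaving node_id."""
    return [(e.edge_type, nodes[e.target_id]) for e in edges if e.source_id == node_id]


for edge_type, target in successors(149832):
    print(edge_type, "->", target.properties["latex"])
```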

Complications in translating: 
  • within each derivation, each expression has a local ID for LaTeX labels
  • each expression belongs to one or more derivations
  • each expression has a different representation in various CASes
  • LaTeX uses a double quote ("), so parsing may break. Reference pictures instead?
CREATE ( 149832:Expression { picture: "149832.png", sympy:"k=m*j", local_tex_id:"59589", in_derivations:["maxwell","funky"] } )
CREATE ( 119831:Expression { picture: "119831.png", sympy:"k/j=m", local_tex_id:"58584", in_derivations:["maxwell"] } )
CREATE ( 149832 )-[:DIVIDEBOTHSIDESBY { feed_1_picture: "4958.png" } ]->( 119831 )
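
An alternative to referencing pictures is to escape the LaTeX before embedding it in a query string. The sketch below generates CREATE statements from Python dictionaries. It assumes that Cypher string literals accept backslash escapes for quotes and backslashes, and that variable names must start with a letter (hence the "e" prefix and the ID stored as a property). The function names and the property layout are hypothetical.

```python
def cypher_string(text):
    """Quote text as a Cypher string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def create_expression(expr_id, properties):
    """Build a CREATE statement for one expression node.

    Values may be strings or lists of strings (e.g. in_derivations).
    """
    parts = ["id: " + str(expr_id)]
    for key, value in properties.items():
        if isinstance(value, list):
            rendered = "[" + ", ".join(cypher_string(v) for v in value) + "]"
        else:
            rendered = cypher_string(value)
        parts.append(key + ": " + rendered)
    return "CREATE (e" + str(expr_id) + ":Expression {" + ", ".join(parts) + "})"


print(create_expression(149832, {
    "latex": r"\vec{k} = m \vec{j}",
    "in_derivations": ["maxwell", "funky"],
}))
```

If the queries go through a driver that supports parameters, passing the LaTeX as a parameter instead of splicing it into the query text would sidestep the escaping problem entirely.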

Although a property graph can hold the same information as the current CSV format, it is no less complex than the CSV approach. Also, as far as I know, only Neo4j supports this syntax. Thus, I'm not currently motivated to switch to a property graph representation.

Sunday, September 13, 2015

linear storytelling constrains how one thinks

Textbooks are linear. Knowledge is not.

Describing content in graph form makes storytelling more complicated, but is a closer representation of knowledge.

major fields for the PDG to cover

Update 20190728: this content has been copied to this page

Step 1: identify major fields in Physics
  • Electromagnetism
  • relativity
    • astrophysics
  • quantum mechanics
  • classical mechanics
    • thermodynamics
    • astrophysics
Step 2: identify top derivations associated with each area
  • EM: Maxwell's equations
  • Relativity: Lorentz (time dilation, length contraction)
  • Quantum: Schrodinger, Uncertainty
  • Classical: F=ma, conservation of energy and momentum

Step 3: 

switching between different Computer Algebra Systems

Although the PDG can be done in a CAS (e.g. Mathematica), it won't translate to other CASes without associated Gödel indices.

Symbols can be composed of other symbols. Example: $E$ and $\vec{E}$, $\partial$ and $\frac{\partial}{\partial t}$

Symbols can be operators, e.g. $\frac{\partial}{\partial t}$

Operators, e.g. $+$, can act on symbols


derivations versus identities

A derivation in physics is distinct from a mathematical identity

The PDG is hierarchical

Physics derivation graph is hierarchical:
  • Symbol, operators 
  • Expressions, inference rules 
  • Step 
  • Derivation
  • PDG
This hierarchy can probably be used for visualization and for determining the storage format.

The PDG is a directed hypergraph.

Streamlining the current PDG process

[output] = inference_rule(input)

[output_1, output_2] = inference_rule(input_1, input_2, feed)



  1. the user selects an inference rule from a list
  2. the user is prompted for the correct number of inputs, outputs, and feeds
  3. inputs can be selected from all previous expressions
  4. the database is populated with the required information
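
The four steps above can be sketched as a small Python function. Everything here is an assumption for illustration: the rule table, the dictionary used as the "database", and the function name are hypothetical, and interactive prompting is replaced by plain arguments so the count check can be seen directly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceRule:
    name: str
    n_inputs: int
    n_feeds: int
    n_outputs: int


# Hypothetical rule table; a real list would be loaded from the PDG.
RULES = {
    "divideBothSidesBy": InferenceRule("divideBothSidesBy", n_inputs=1, n_feeds=1, n_outputs=1),
}


def add_step(database, rule_name, inputs, feeds, outputs):
    """Validate one derivation step against its rule and record it."""
    rule = RULES[rule_name]                                   # step 1: select rule
    given = (len(inputs), len(feeds), len(outputs))
    expected = (rule.n_inputs, rule.n_feeds, rule.n_outputs)
    if given != expected:                                     # step 2: check counts
        raise ValueError("expected (inputs, feeds, outputs) = %s, got %s" % (expected, given))
    for expression in inputs:                                 # step 3: inputs must already exist
        if expression not in database["expressions"]:
            raise ValueError("unknown input expression: " + expression)
    for expression in outputs:                                # step 4: populate the database
        if expression not in database["expressions"]:
            database["expressions"].append(expression)
    step = {"rule": rule_name, "inputs": list(inputs), "feeds": list(feeds), "outputs": list(outputs)}
    database["steps"].append(step)
    return step


database = {"expressions": ["k=m j"], "steps": []}
add_step(database, "divideBothSidesBy", inputs=["k=m j"], feeds=["j"], outputs=["k/j=m"])
print(database)
```

Each recorded step connects a set of inputs to a set of outputs through one rule, which is the hyperedge in the directed-hypergraph view of the PDG.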