Physics Derivation Graph: a grand vision for bulk .tex analysis

Saturday, June 6, 2020

Current plan for bulk .tex analysis and math extraction

characterization and counting of .tex files in arXiv
anomaly detection and trie data structures for the .tex files in arXiv
clean up the LaTeX to remove formatting directives (this can be handled in the grammar)
The minimal regex is based on a threshold from the trie data structure.
Work is in progress to automate regex generation.
use the regex to lex the LaTeX character stream into tokens and build ASTs
parse the ASTs for math syntax (e.g., into SymPy)
check the dimensional consistency of expressions using SymPy
use inference rules to create steps that relate math expressions
use SymPy to validate each inference rule application
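
The trie, threshold, and regex steps in the plan above could look roughly like the following sketch. It is only an illustration under stated assumptions, not the project's actual code: the sample string, the threshold value, the token names, and the helper functions `build_trie`, `frequent_words`, `build_lexer`, and `lex` are all hypothetical. It stops at a flat token list and does not build an AST.

```python
import re

# Hypothetical sample standing in for a corpus of arXiv .tex files.
SAMPLE = r"\frac{a}{b} + \alpha \frac{c}{d} = \alpha \beta"

def build_trie(words):
    """Build a character trie; each node counts the words passing through it."""
    root = {"count": 0, "children": {}}
    for word in words:
        node = root
        node["count"] += 1
        for ch in word:
            node = node["children"].setdefault(ch, {"count": 0, "children": {}})
            node["count"] += 1
        node["end"] = node.get("end", 0) + 1  # number of words ending here
    return root

def frequent_words(node, threshold, prefix=""):
    """Collect complete words that occur at least `threshold` times."""
    found = []
    if node.get("end", 0) >= threshold:
        found.append(prefix)
    for ch, child in sorted(node["children"].items()):
        if child["count"] >= threshold:  # prune rare branches early
            found.extend(frequent_words(child, threshold, prefix + ch))
    return found

def build_lexer(kept):
    """Generate a regex whose KNOWN branch lists the frequent commands."""
    # Longest first, so a command is never shadowed by a shorter prefix.
    ordered = sorted(kept, key=len, reverse=True)
    # "(?!)" never matches; it keeps the pattern valid when `kept` is empty.
    alternatives = "|".join(re.escape(w) for w in ordered) or r"(?!)"
    return re.compile(
        r"(?P<KNOWN>(?:" + alternatives + r")(?![A-Za-z]))"
        r"|(?P<OTHER_CMD>\\[A-Za-z]+)"
        r"|(?P<SYMBOL>[A-Za-z0-9])"
        r"|(?P<PUNCT>[{}+=^_()-])"
        r"|(?P<SPACE>\s+)"
    )

def lex(text, lexer):
    """Turn a LaTeX string into (token_type, text) pairs, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        m = lexer.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SPACE":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

commands = re.findall(r"\\[A-Za-z]+", SAMPLE)
trie = build_trie(commands)
kept = frequent_words(trie, threshold=2)  # threshold chosen arbitrarily here
tokens = lex(SAMPLE, build_lexer(kept))
```

With this sample, `\frac` and `\alpha` each appear twice and so pass the threshold, while `\beta` appears once and falls through to the generic `OTHER_CMD` branch. On a real corpus the threshold would be tuned against the command counts gathered in the characterization step.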
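
The last SymPy steps in the plan can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the expressions are built by hand rather than parsed from LaTeX, the inference rule "multiply both sides by a factor" is chosen only as an example, and the helper `multiply_both_sides_is_valid` is hypothetical. The dimension check uses the dimension system in `sympy.physics.units`.

```python
import sympy
from sympy.physics.units import energy, mass, velocity
from sympy.physics.units.systems.si import dimsys_SI

def multiply_both_sides_is_valid(before, after, factor):
    """Check that `after` is `before` with both sides multiplied by `factor`."""
    lhs_ok = sympy.simplify(before.lhs * factor - after.lhs) == 0
    rhs_ok = sympy.simplify(before.rhs * factor - after.rhs) == 0
    return lhs_ok and rhs_ok

a, b, c = sympy.symbols("a b c")
before = sympy.Eq(a, b)
good = sympy.Eq(a * c, b * c)   # correct application of the rule
bad = sympy.Eq(a * c, b + c)    # right-hand side was added to, not multiplied

valid_step = multiply_both_sides_is_valid(before, good, c)
invalid_step = multiply_both_sides_is_valid(before, bad, c)

# Dimensional consistency of E = m c^2: energy must match mass * velocity^2.
dims_match = dimsys_SI.equivalent_dims(energy, mass * velocity**2)
dims_mismatch = dimsys_SI.equivalent_dims(energy, mass * velocity)
```

Comparing each side separately, instead of the whole equation at once, makes it possible to report which side of a step fails validation.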
