Sunday, May 16, 2021

what would create a tipping point

Every scientist coming to the website https://derivationmap.net/ is unlikely. 

Graph analysis 

This is what I've historically chased -- identifying the data structure, and the data input mechanism. 

Extracting value from staring at a visualization of the graph of equations is unlikely. I'm not clear what graph queries are relevant to run against the graph content.


Consistency within a document

The local value to both the author and the reader is in determining whether the mathematical content of the paper being read is self-consistent. Practically, that means

  • are the dimensions of variables with each expression consistent?
  • are the units used in each expression consistent?
  • are the variables clearly defined? Here "definition" means a tuple of (symbol, dimensions, definition)
    • are the variables used consistently throughout this paper?
  • are the operators well defined? Here "definition" means a tuple of (symbol, number of inputs, number of outputs, constraints per input, constraints per output). 
    • are the operators used consistently throughout this paper?
  • is the mathematical content consistent with the written text?
Some of these aspects are explored on https://derivationmap.net/clickable_layers

As an author, I want to write Latex that generates a document that is mathematically correct.

Mathematical typos should be detected (similar to spell-check). As I enter text (or as a post-processing phase), the computer should guess whether the content is math or non-math. If math, then prompt the author for relevant details. 

Options for implementation

Overleaf is open source: https://github.com/overleaf/overleaf, so modifying it could be an option.


Cross-document analysis

In a larger context, the relevant value questions include

  • how does the paper I'm currently reading (or writing) relate to other papers?
  • how does the paper I'm currently reading (or writing) build on previous work?

Rather than bibliographic citation, I care about mathematical provenance. 

The specific symbols may vary across papers, and the dimensions may vary (e.g., renormalizing the speed of light to 1), but definitions have to be shared. 

The scientific community currently resorts to bibliographic citation because that is the only provenance available, not because it is what matters or what we value.

The cross-document analysis is not feasible without semantic content. The current approach of unstructured text with few hyperlinks requires human readers. Addressing the intra-document consistency challenge might yield semantic markup that enables cross-document analysis.