Tuesday, July 28, 2020

barriers to implementing formalization in mathematics

Inspired by reading the QED manifesto and responses like

I attended a conference on formalization of mathematics. The challenges to progress appear to include the following
  • The diversity of topics (both depth and breadth) is huge. 
  • The variable level of formalization (from natural language to Latex to Constrained Natural Languages to Computer Algebra Systems to Proof assistants. 
  • Addressing all the requirements (readability, conciseness, preciseness) is infeasible with current techniques, so selecting any potential solution is suboptimal. Investing resources (time, attention, compute, creativity) in a suboptimal solution is an opportunity cost.
  • There is more brain power than staffing. Good ideas don't get implemented because hiring software programmers isn't accessible
  • The staffing that is available is graduate students. These are neither domain experts nor experienced programmers
  • The degree of automation for projects varies. What is considered hard or scalable depends on who is doing the work. 
  • Design decisions are driven by the specific problems being solved. A "universal" and comprehensive approach incurs inefficiency in any given domain. 
  • The degree of software development best practices varies (linting, continuous integration/continuous deployment, use of version control)
  • The diversity of implementation choices (Python, Haskell, Coq, C, C++)
  • Inventing novel domain specific languages is easy for this community, further mudding the options
  • Standard (basic) math is not exciting to build a product around
  • Maintenance of software and databases is not exciting research
  • There are a variety of databases which hold proofs: 
  • Anyone can post code (e.g., github); also there is no requirement to post source code
    • There is no standardization on what level of quality is acceptable
    • There is no review process
    • Linking of code to papers to presentations is unusual
  • Licensing of code and results varies -- open source, closed source, unspecified
  • Who the audience should be is unclear -- advancing research or sharing with advanced peers or on-boarding graduate students or  teaching undergraduates?
  • Specific goal (beyond QED manifesto) is not agreed upon; there are no milestones.
    • use cases are undefined
    • well-defined challenge thresholds are needed
  • A survey and detailed comparison of various options is not readily available?
  • Translation between representations is usually lossy since scopes are not one-to-one
  • There is no authority, formal or informal. There is no designated leader
  • There is no governance structure. Decision making is ad hoc and social
  • Incentives for participants include
    • Publishing papers
    • Advancing knowledge
    • Building reputation
    • Forming alliances

As a consequence of the above issues, the selection function becomes who can produce the most creative idea and get it implemented in order to compete for attention.

Top-down direction of effort isn't required, but the consequence of bottom-up coordination is that there duplicated effort and investment in translation between faction required.



The above list gets oversimplified into observations like "no barrier exists other than someone sitting down and doing the work."
There are a couple implicit assumptions there:

  • the ill-defined approach in my mind is the best and right method
  • that someone is not me. Probably a graduate student since the work is trivial

Speed does not seem to be an issue. There is not a compute bottleneck. 

No comments:

Post a Comment