Friday, August 9, 2019

updated task list for August 2019: SQL, arxiv

Since posting my previous task list in May, I have made good progress with Docker and a Flask-based web interface. The web interface has progressed far enough that I now need to connect the front end to the SQL database backend. The next steps are:
  • The SQL database needs to be connected to the Flask-based web frontend
  • The SQL database needs to be populated with content from the CSVs (the expressions, inference rules, and derivations in version 6)
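
As a rough idea of what the second step could look like, here is a minimal sketch that loads expressions from a CSV into SQLite using only the Python standard library. The table name, the column names (expression_id, latex), the two-column CSV layout, and the sample IDs are all assumptions made for illustration; the real version 6 CSVs and the eventual schema may differ.

```python
import csv
import io
import sqlite3


def load_expressions(conn, csv_file):
    """Create a (hypothetical) expressions table and fill it from CSV rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS expressions ("
        "expression_id TEXT PRIMARY KEY, latex TEXT NOT NULL)"
    )
    # Assumed layout: first column is an ID, second column is the LaTeX string.
    rows = [
        (r[0].strip(), r[1].strip())
        for r in csv.reader(csv_file)
        if len(r) >= 2
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO expressions VALUES (?, ?)", rows
        )
    return len(rows)


def get_expression(conn, expression_id):
    """Look up the LaTeX for one expression; returns None if it is missing."""
    row = conn.execute(
        "SELECT latex FROM expressions WHERE expression_id = ?",
        (expression_id,),
    ).fetchone()
    return row[0] if row else None


# Made-up sample data standing in for one of the CSV files.
sample_csv = io.StringIO('4928923942,"a = b"\n9582958294,"a + c = b + c"\n')
conn = sqlite3.connect(":memory:")
n_loaded = load_expressions(conn, sample_csv)
```

For the first step, a Flask view function could call something like get_expression with a connection opened per request, which keeps the SQL in one place instead of scattering queries across the routes.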

A colleague found that the LaTeX source for arXiv articles is available in bulk in S3 buckets. As an alternative to S3, arXiv points to a subset that is available without going through AWS: https://www.cs.cornell.edu/projects/kddcup/datasets.html
The value of having a large number of expressions in LaTeX is that we could use them to predict what a user wants to enter, decreasing the amount of manual entry required. Also, if a derivation contains expressions similar to those in the arXiv content, we could investigate whether the derivation is related to the corresponding arXiv paper.
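
To make the prediction idea concrete, here is a small sketch of one possible approach: pull inline math out of LaTeX source, count how often each expression appears, and suggest the most frequent expressions that start with what the user has typed so far. The function names, the regular expression, and the sample text are my own assumptions for illustration. Real arXiv sources would also need handling for display math and equation environments, which this ignores.

```python
import re
from collections import Counter

# Matches inline math of the form $...$ only; $$...$$ and
# \begin{equation} blocks are deliberately not handled in this sketch.
INLINE_MATH = re.compile(r"\$([^$]+)\$")


def extract_expressions(latex_source):
    """Return the inline math expressions found in a LaTeX string."""
    return [m.strip() for m in INLINE_MATH.findall(latex_source)]


def suggest(prefix, counts, limit=3):
    """Suggest known expressions starting with prefix, most frequent first."""
    matches = [(e, n) for e, n in counts.items() if e.startswith(prefix)]
    # Sort by descending frequency, then alphabetically for stable output.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [e for e, _ in matches[:limit]]


# Made-up text standing in for the body of an arXiv paper.
corpus = (
    r"Newton's law $F = m a$ is used here, along with $E = m c^2$. "
    r"Later we return to $F = m a$ and compare it with $F = -k x$."
)
counts = Counter(extract_expressions(corpus))
suggestions = suggest("F =", counts)
```

Prefix matching on raw strings is sensitive to spacing and to equivalent ways of writing the same expression, so some normalization of the LaTeX would likely be needed before this is useful on real papers.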