- The SQL database needs to be connected to the Flask-based web frontend
- The SQL database needs to be populated with content from the CSVs (expressions, inference rules, and derivations in version 6)
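
As a rough idea of what the CSV-to-SQL step could look like, here is a minimal sketch using Python's built-in `csv` and `sqlite3` modules. The table name `expressions`, the column names `expression_id` and `latex`, and the helper `load_csv_into_table` are all hypothetical; the actual CSV layout in version 6 and the choice of SQL engine may differ.

```python
import csv
import sqlite3


def load_csv_into_table(conn, csv_file, table, columns):
    """Copy rows from an open CSV file into a SQL table and return the row count.

    `table` and `columns` are hypothetical names chosen by the caller. They are
    interpolated into the SQL text, so they must come from trusted code and
    never from user input.
    """
    column_list = ", ".join(columns)
    column_defs = ", ".join(f"{name} TEXT" for name in columns)
    placeholders = ", ".join("?" for _ in columns)

    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({column_defs})")

    # Keep only rows whose field count matches the expected columns.
    rows = [tuple(row) for row in csv.reader(csv_file) if len(row) == len(columns)]

    # Row values go through "?" placeholders so sqlite3 handles the quoting.
    conn.executemany(
        f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})", rows
    )
    conn.commit()
    return len(rows)
```

It could be called as `load_csv_into_table(conn, open("expressions.csv", newline=""), "expressions", ["expression_id", "latex"])`, where the file name is again only an assumption. Storing every column as `TEXT` keeps the sketch short; a real schema would likely add keys and types.
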
A colleague found that the LaTeX source for arXiv articles is available in bulk in S3 buckets. As an alternative to S3, arXiv points to a subset that is available without going through AWS: https://www.cs.cornell.edu/projects/kddcup/datasets.html
The value of having a large number of expressions in LaTeX is that we could use them to predict what a user wants to enter, decreasing the amount of manual entry required. Also, if a derivation contains expressions similar to those in the arXiv content, we could investigate whether the derivation is related to the arXiv paper.
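
To make the two uses concrete, here is a small sketch using only Python's standard library. It assumes the expressions have already been extracted from the LaTeX source into a list of strings. The function names `suggest_completions` and `most_similar` are hypothetical, and plain string similarity via `difflib` is only a stand-in for whatever matching would eventually be used, since it knows nothing about mathematical equivalence.

```python
import difflib


def suggest_completions(prefix, known_expressions, limit=5):
    """Return up to `limit` known expressions that start with what was typed."""
    matches = sorted(
        expr for expr in set(known_expressions) if expr.startswith(prefix)
    )
    return matches[:limit]


def most_similar(expression, known_expressions, cutoff=0.6, limit=3):
    """Return known expressions whose text is close to `expression`.

    Similarity is difflib's character-based ratio, so this only catches
    expressions that are written similarly, not ones that are equivalent
    but typeset differently.
    """
    return difflib.get_close_matches(
        expression, known_expressions, n=limit, cutoff=cutoff
    )
```

The first function covers the "predict what a user wants to enter" case as simple prefix completion. The second gives a starting point for flagging a derivation step that resembles an expression from a paper. For a corpus the size of arXiv, a linear scan like this would probably be too slow, and some kind of index would be needed.
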