Wednesday, April 8, 2020

a terrible hack to get JSON into a database

I've been using JSON to store Physics Derivation Graph content. The motivation is that JSON stores data in the form that most closely reflects how I think of the data structure in Python: nested dictionaries and lists.

To support multiple concurrent users, a JSON file doesn't work: concurrent writes would require file locks to ensure changes are not lost.
Migrating from JSON to a table-based database (e.g., MySQL, PostgreSQL, SQLite) would incur a significant rewrite. Another option would be Redis, specifically the ReJSON module, which adds a nested, JSON-like data type alongside Redis's flat hashes.
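For comparison, a rough sketch of what the ReJSON route might look like, using redis-py's generic execute_command (the key name 'pdg' and the dictionary contents are placeholders, and the module has to be loaded in the Redis server):

>>> import json
>>> import redis
>>> rd = redis.Redis()  # assumes a local Redis server with the ReJSON module loaded
>>> dat = {'expressions': {}, 'derivations': {}}  # placeholder structure
>>> reply = rd.execute_command('JSON.SET', 'pdg', '.', json.dumps(dat))
>>> dat_from_redis = json.loads(rd.execute_command('JSON.GET', 'pdg', '.'))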

I'm wary of depending on a module for data storage, and I'm reluctant to rewrite the PDG as tables.
There is a terrible hack that lets me stick with JSON while resolving the concurrency issue and avoiding a significant rewrite: serialize the JSON and store it in Redis as one very long string.
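Part of what makes this workable is that Redis handles the coordination: each SET is atomic, and redis-py exposes a lock that could wrap a read-modify-write cycle. A minimal sketch, assuming a local Redis server and a key that already exists (the lock name, timeout, and modification are placeholders):

>>> import json
>>> import redis
>>> rd = redis.Redis()  # assumes a local Redis server on the default port
>>> with rd.lock('data.json-lock', timeout=10):  # lock expires after 10 s in case of a crash
...     dat = json.loads(rd.get('data.json'))    # assumes the key already exists
...     dat['dummy_entry'] = 'placeholder'       # hypothetical modification
...     rd.set(name='data.json', value=json.dumps(dat))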

Redis has a maximum string length of 512 MB (!) according to
https://redis.io/topics/data-types
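Since the whole database becomes a single string, it seems worth checking that the serialized file stays under that cap; a quick sanity check on the file already on disk (path assumed to be data.json):

>>> import os
>>> size_in_mb = os.path.getsize('data.json') / (1024 * 1024)
>>> assert size_in_mb < 512, 'too large for a single Redis string'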

What I'm currently doing:
>>> import json
>>> path_to_db = 'data.json'
>>> with open(path_to_db) as json_file:
...     dat = json.load(json_file)

Terrible hack:

Read the content as text, then save it to Redis:
>>> import redis
>>> rd = redis.Redis()  # connection details assumed: localhost on the default port
>>> with open(path_to_db) as jfil:
...     jcontent = jfil.read()
>>> rd.set(name='data.json', value=jcontent)
True

which can be simplified to

>>> with open(path_to_db) as jfil:
...     rd.set(name='data.json', value=jfil.read())

Then, to read the file back in, use

>>> file_content = rd.get('data.json')  # returned as bytes
>>> dat = json.loads(file_content)      # json.loads accepts bytes
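
If plain strings are preferred over bytes, redis-py can be told to decode replies automatically; a variant assuming that option:

>>> rd = redis.Redis(decode_responses=True)  # replies come back as str instead of bytes
>>> dat = json.loads(rd.get('data.json'))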
