Tuesday, March 24, 2026

migrating from blogger to self-hosted site

BLUF: I'm now switching to a self-hosted approach: https://allofphysics.com/blog

I've been using blogger for the Physics Derivation Graph since 2015. That's 11 years and 297 posts, with many of those in 2020 due to the pandemic.

Blogger has a lot of good features, but the downside is that traffic associated with the content does not result in readers finding the primary website.

This site (https://physicsderivationgraph.blogspot.com/) will remain in place, but new content will appear only on https://allofphysics.com/blog

Thursday, February 19, 2026

advice for writing software using LLMs

Rewritten from https://www.reddit.com/r/ClaudeCode/comments/1qxvobt/ive_used_ai_to_write_100_of_my_code_for_1_year_as/

Initial conditions are vital for trajectory

The first few thousand lines set the patterns for the project. When starting a new project, pay attention to getting the process, guidelines, and guardrails right from the start. Whenever something is being done for the first time, make sure it's done cleanly. Those early patterns are what the agent replicates across the rest of the project. Get the design wrong early and the whole project turns to garbage. If your codebase is clean, AI makes it cleaner and faster. If it's a mess, AI makes it messier faster.

Producing code does not imply value

The temporary dopamine hit from shipping with AI agents makes you blind. You think you're going fast, but zoom out and you actually go slower because of constant refactors from technical debt ignored early.

Project health and one-shot prompts

A measure of project health: when you want to do something, can you do it in one shot? If not, then either the code is becoming a mess, you don't understand some part of the system well enough to craft a good prompt, or the problem is too big to tackle all at once and needs breaking down.

Skill and expertise still matter

There's a big difference between technical and non-technical people using LLMs to build production apps. Engineers who built projects before LLMs know what to watch out for and can detect when things go sideways. Non-technical people can't. Architecture, system design, security, and infra decisions will bite them later.

Design decisions are significant investments

Choosing the right framework, dependencies, or database schema (the foundation everything else is built on) can't be done by giving your LLM a one-liner prompt. These decisions deserve more time than adding a feature.

LLM code is not optimized by default

LLM-generated code is not optimized for security, performance, or scalability by default. You have to explicitly ask for it and verify it yourself.

Review code changes

The LLM might use created_at as a fallback for birth_date. That won't be caught with just testing whether the feature works or not. The LLM is an actor pretending to fulfill a role.
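
To make that failure mode concrete, here is a minimal sketch of such a silent fallback. The field names come from the example above; the function names and the sample record are hypothetical.

```python
from datetime import date


def birth_date_with_fallback(user: dict) -> date:
    # Anti-pattern: silently substitutes an unrelated field when birth_date is missing.
    return user.get("birth_date") or user["created_at"]


def birth_date_strict(user: dict) -> date:
    # Safer: a missing birth_date is reported instead of papered over.
    value = user.get("birth_date")
    if value is None:
        raise ValueError("birth_date is missing")
    return value


def returns_a_date(getter, user: dict) -> bool:
    # A shallow "does the feature work?" check: it only looks at the return type.
    return isinstance(getter(user), date)


# Hypothetical record with no birth date on file.
user_without_birth_date = {"birth_date": None, "created_at": date(2026, 2, 1)}
```

Here returns_a_date(birth_date_with_fallback, user_without_birth_date) passes even though the returned value is the account creation date. Only reading the change, or asserting on the actual value, exposes the problem.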

LLMs are not databases of facts

Avoid using LLMs for facts and for anything that could be evaluated with a deterministic function (e.g., 1+1).
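
As an illustration of what "a deterministic function" can look like, here is a small sketch of an arithmetic evaluator built on Python's ast module. The function name and the set of supported operators are choices made for this example, not part of the original advice.

```python
import ast
import operator

# Only these binary operators are accepted; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_arithmetic(expression: str):
    """Evaluate +, -, *, / on numeric literals without calling eval()."""

    def _eval(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


print(evaluate_arithmetic("1+1"))  # 2
```

The same input always gives the same answer, and anything that is not plain arithmetic (a function call, a name) raises ValueError instead of producing a plausible-looking guess.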

Saturday, January 31, 2026

Hetzner VPS initial configuration notes

Having just configured the Oracle cloud server, and having used DigitalOcean for years, Hetzner's interface for initial configuration is closer to DigitalOcean than Oracle cloud. And the price of $3.50/month for 2 CPUs and 4GB of RAM with 40GB of storage is mindblowing compared to DigitalOcean's 2 CPUs with 1 GB of RAM and 25GB of storage for $6/month.
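
One way to put numbers on that comparison is cost per GB of RAM. The sketch below uses only the prices and specs quoted above; choosing RAM as the yardstick is an assumption of this example.

```python
# Monthly prices and specs as quoted in the paragraph above.
offers = {
    "Hetzner": {"usd_per_month": 3.50, "ram_gb": 4, "storage_gb": 40},
    "DigitalOcean": {"usd_per_month": 6.00, "ram_gb": 1, "storage_gb": 25},
}


def usd_per_gb_ram(offer: dict) -> float:
    # Monthly price divided by the amount of RAM included.
    return offer["usd_per_month"] / offer["ram_gb"]


for name, offer in offers.items():
    print(f"{name}: ${usd_per_gb_ram(offer):.3f} per GB of RAM per month")
```

By this measure Hetzner comes out at $0.875 per GB of RAM and DigitalOcean at $6.000.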

Configuration and management interface is at https://console.hetzner.com/

Create user

I logged into the server using

  $ ssh -i ~/.ssh/KEYNAMEHERE root@SERVERIPHERE

and then created a new user and group:

# sudo adduser pdg
# sudo adduser pdg sudo
# sudo addgroup pdg_grp
# sudo usermod -aG pdg_grp pdg

The -a (append) and -G (groups) options ensure the user is added to the new group without being removed from their other groups.

Validating,

# groups pdg
pdg : pdg sudo users pdg_grp
# id pdg
uid=1000(pdg) gid=1000(pdg) groups=1000(pdg),27(sudo),100(users),1001(pdg_grp)
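
If this check ever needs to be scripted, the id output can be parsed. The following is a rough sketch; the function name is made up for this example, and it assumes group names contain no spaces.

```python
import re


def parse_id_groups(id_output: str) -> set:
    """Return the group names listed in the groups=... field of `id` output."""
    match = re.search(r"groups=(\S+)", id_output)
    if match is None:
        return set()
    # Each entry looks like 27(sudo); keep only the name inside the parentheses.
    return set(re.findall(r"\(([^)]+)\)", match.group(1)))


sample = "uid=1000(pdg) gid=1000(pdg) groups=1000(pdg),27(sudo),100(users),1001(pdg_grp)"
missing = {"sudo", "pdg_grp"} - parse_id_groups(sample)
print("missing groups:", sorted(missing))  # missing groups: []
```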

SSH

Next, disable root from logging in via SSH. First, copy my public key from my laptop to the server

$ ssh-copy-id pdg@SERVERIPHERE

On the server I exited the root SSH session and logged back in as pdg:

$ ssh -i ~/.ssh/KEYNAMEHERE pdg@SERVERIPHERE
$ sudo vi /etc/ssh/sshd_config
     PermitRootLogin no
     PubkeyAuthentication yes
     PasswordAuthentication no
     PermitEmptyPasswords no
     AuthorizedKeysFile      .ssh/authorized_keys .ssh/authorized_keys2

$ mkdir ~/.ssh
$ chmod 700 ~/.ssh
$ cat ~/.ssh/authorized_keys
$ sudo systemctl daemon-reload
$ sudo systemctl restart ssh.socket
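
A quick way to double-check the edited file is to parse it and compare it against the intended values. This is only a sketch: it keeps the first value seen for each keyword (which is how sshd resolves repeated keywords) but ignores Include and Match directives, so a drop-in file could still override these settings. The function names are hypothetical; sshd's own test mode (sudo sshd -T), which prints the effective configuration, is likely the more authoritative check.

```python
EXPECTED = {
    "permitrootlogin": "no",
    "pubkeyauthentication": "yes",
    "passwordauthentication": "no",
    "permitemptypasswords": "no",
}


def first_values(config_text: str) -> dict:
    """Map each lowercased keyword to the first value given for it."""
    settings = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        parts = line.split(None, 1)
        if len(parts) == 2:
            # setdefault keeps the first occurrence of a keyword
            settings.setdefault(parts[0].lower(), parts[1].strip())
    return settings


def mismatches(config_text: str) -> dict:
    """Return {keyword: actual_value} for every expected setting that differs."""
    actual = first_values(config_text)
    return {key: actual.get(key) for key, wanted in EXPECTED.items()
            if actual.get(key) != wanted}


sample = """
PermitRootLogin no
PubkeyAuthentication yes
PasswordAuthentication no
PermitEmptyPasswords no
AuthorizedKeysFile      .ssh/authorized_keys .ssh/authorized_keys2
"""
print(mismatches(sample))  # {}
```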

Updating bash configuration

Added to ~/.bash_aliases

alias vi='vim'

alias s='git status'
alias p='git push'

# do not overwrite existing files
set -o noclobber

alias nothing='docker ps; git pull; git push; git status'
alias ll="ls -hal"
alias ..="cd .."
alias grin="grep -R -i -n --color"

and to ~/.bashrc

# HISTSIZE determines the number of commands remembered in memory during the current session.
# HISTFILESIZE determines the maximum number of lines allowed in the history file
export HISTSIZE=100000
export HISTFILESIZE=200000

export HISTTIMEFORMAT="%h %d %H:%M:%S "

shopt -s histappend
shopt -s cmdhist

UFW = Uncomplicated Firewall

104.210.140.141 = OpenAI, observed 2025-12-04
114.119.147.137 = PetalBot (for Huawei), observed 2025-12-04
156.59.198.136 = bytedance, observed 2025-12-04

Best practice for order of firewall rules is:

Specific DENY rules (blocking known bad actors).
Specific ALLOW rules (allowing trusted hosts/networks).
General ALLOW rules (allowing public services).
General DENY rules (the default policy, often implied).

$ sudo ufw status verbose
Status: inactive
$ sudo nft list ruleset

add rules

$ sudo ufw deny from 156.59.198.0/24
$ sudo ufw deny from 114.119.147.0/24
$ sudo ufw deny from 104.210.140.0/24
$ sudo ufw allow ssh
$ sudo ufw allow 443
$ sudo ufw allow 80
$ sudo ufw enable

Now I have

$ sudo ufw status numbered
Status: active

     To                         Action      From
     --                         ------      ----
[ 1] Anywhere                   DENY IN     156.59.198.0/24           
[ 2] Anywhere                   DENY IN     114.119.147.0/24          
[ 3] Anywhere                   DENY IN     104.210.140.0/24          
[ 4] 22/tcp                     ALLOW IN    Anywhere                  
[ 5] 443                        ALLOW IN    Anywhere                  
[ 6] 80                         ALLOW IN    Anywhere                  
[ 7] 22/tcp (v6)                ALLOW IN    Anywhere (v6)             
[ 8] 443 (v6)                   ALLOW IN    Anywhere (v6)             
[ 9] 80 (v6)                    ALLOW IN    Anywhere (v6)

Update OS

Running Ubuntu 24.04.3 LTS, default load is

 System information as of Sat Jan 31 02:24:09 AM UTC 2026
  System load:  0.04              Processes:             123
  Usage of /:   3.0% of 37.23GB   Users logged in:       1
  Memory usage: 5%                IPv4 address for eth0: 65.21.252.29
  Swap usage:   0%                IPv6 address for eth0: 2a01:4f9:c013:5897::1

$ sudo apt update
$ sudo apt upgrade

Install Docker

$ sudo apt update
$ sudo apt install apt-transport-https curl
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
$ echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
$ sudo apt update
$ sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
$ sudo systemctl is-active docker
active

as per https://linuxiac.com/how-to-install-docker-on-ubuntu-24-04-lts/

Then add pdg to the docker group

$ sudo usermod -a -G docker pdg

as per https://stackoverflow.com/questions/47854463/docker-got-permission-denied-while-trying-to-connect-to-the-docker-daemon-socke/48450294#48450294

$ sudo apt -y install make

Enable Upload to GitHub

$ ssh-keygen
$ cat ~/.ssh/id_ed25519.pub

Upload public key to my profile, https://github.com/settings/keys

Clone git repo

On the server

$ git clone git@github.com:allofphysicsgraph/ui_v8_website_flask_neo4j.git
$ cd /home/pdg/ui_v8_website_flask_neo4j

On my laptop

$ scp -i ~/.ssh/KEYNAMEHERE neo4j_pdg/plugins/apoc.jar pdg@SERVERIPHERE:/home/pdg

On the server

cd /home/pdg/ui_v8_website_flask_neo4j
echo "UID=$(id -u)" > .env
echo "GID=$(id -g)" >> .env

Build containers

$ cd /home/pdg/ui_v8_website_flask_neo4j
$ make container_build

Which results in

$ docker images
IMAGE                                ID             DISK USAGE   CONTENT SIZE   EXTRA
ui_v8_flask_webserver:latest-amd64   68046aa182c6       3.61GB          872MB

From my laptop I copy the secrets file

$ scp -i ~/.ssh/KEYNAMEHERE .env.google pdg@SERVERIPHERE:/home/pdg/ui_v8_website_flask_neo4j

On the server I have a few changes to enact,

cd /home/pdg/ui_v8_website_flask_neo4j
mkdir logs
cd certs
openssl genrsa > privkey.pem
openssl req -new -x509 -key privkey.pem > fullchain.pem
openssl dhparam -out dhparam.pem 2048

I then tried running the server using

cd /home/pdg/ui_v8_website_flask_neo4j
make launch_webserver

but encountered some permissions issues that were fixed using

cd /home/pdg/ui_v8_website_flask_neo4j
sudo chown -R pdg:pdg neo4j_pdg/
sudo chown -R pdg:pdg dumping_grounds/

I was then able to successfully run

$ make launch_webserver COMPOSE_FLAGS=--detach

The webserver is then running at SERVERIPHERE on the internet!

DNS

Log into https://account.squarespace.com/domains

On the page https://account.squarespace.com/domains/managed/allofphysics.com/dns/domain-nameservers change the Name Servers to Hetzner's:

helium.ns.hetzner.com
hydrogen.ns.hetzner.com
oxygen.ns.hetzner.com

On the Hetzner console webpage, select "DNS" and then "Add DNS zone"

Set the domain (allofphysics.com) and "Create an Empty zone." (The other options are "Import Zone file" and "Secondary".)

On DigitalOcean's DNS console, change TTL to 300 for each of the domains.

Since the nameservers were still ns1.digitalocean.com, I needed to change the records in the DigitalOcean Control Panel.

In DigitalOcean's Control Panel, set the "A" records to point to Hetzner's SERVERIPHERE and the "NS" records to point to Hetzner's name servers.

In Hetzner's panel, set "@" and "www" and "*" to Hetzner's SERVERIPHERE.

Certbot

In Hetzner's DNS panel, I added a TXT record "_acme-challenge" with TTL 300. The "value" is provided later by certbot.

Followed instructions from https://certbot.eff.org/instructions?ws=nginx&os=pip

cd /home/pdg/
sudo apt update
sudo apt upgrade
sudo apt install python3 python3-dev python3-venv libaugeas-dev gcc
sudo python3 -m venv /opt/certbot/
sudo /opt/certbot/bin/pip install --upgrade pip
sudo /opt/certbot/bin/pip install certbot certbot-nginx
sudo ln -s /opt/certbot/bin/certbot /usr/local/bin/certbot

Since my nginx isn't running directly on the host (bare metal), I can't use the recommended sudo certbot certonly --nginx.

Instead, I used

sudo certbot certonly --manual --preferred-challenges dns \
--server https://acme-v02.api.letsencrypt.org/directory \
-d derivationmap.net -d www.derivationmap.net \
-d allofphysics.com -d www.allofphysics.com

Saving debug log to /var/log/letsencrypt/letsencrypt.log
Requesting a certificate for derivationmap.net and 3 more
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Please deploy a DNS TXT record under the name:

_acme-challenge.allofphysics.com.

with the following value:

ARANDOMLOOKINGSTRINGHERE

Before continuing, verify the TXT record has been deployed. Depending on the DNS
provider, this may take some time, from a few seconds to multiple minutes. You can
check if it has finished deploying with aid of online tools, such as the Google
Admin Toolbox: https://toolbox.googleapps.com/apps/dig/#TXT/_acme-challenge.allofphysics.com.
Look for one or more bolded line(s) below the line ';ANSWER'. It should show the
value(s) you've just added.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Press Enter to Continue

Successfully received certificate.
Certificate is saved at: /etc/letsencrypt/live/derivationmap.net/fullchain.pem
Key is saved at:         /etc/letsencrypt/live/derivationmap.net/privkey.pem
This certificate expires on 2026-05-01.
These files will be updated when the certificate renews.

NEXT STEPS:
- This certificate will not be renewed automatically. 
Autorenewal of --manual certificates requires the use of 
an authentication hook script (--manual-auth-hook) but one was not provided. 
To renew this certificate, repeat this same certbot command 
before the certificate's expiry date.
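
Because this certificate will not renew on its own, it helps to work out the deadline. The sketch below assumes the certificate was issued on the day of this post (2026-01-31) with a 90-day lifetime, which matches the 2026-05-01 expiry reported above. The 30-day safety margin is an arbitrary choice for this example.

```python
from datetime import date, timedelta

issued = date(2026, 1, 31)         # assumption: issued on the day of this post
lifetime = timedelta(days=90)      # assumption: 90-day certificate lifetime
renew_margin = timedelta(days=30)  # assumption: personal safety margin

expires = issued + lifetime
renew_by = expires - renew_margin


def days_remaining(today: date) -> int:
    # Whole days left before the certificate expires.
    return (expires - today).days


print(expires)   # 2026-05-01
print(renew_by)  # 2026-04-01
```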

The Google Admin Toolbox page (https://toolbox.googleapps.com/apps/dig/#TXT/_acme-challenge.allofphysics.com) didn't pick up the TXT record, but nslookup did:

$ nslookup -q=TXT _acme-challenge.allofphysics.com hydrogen.ns.hetzner.com
Server:		hydrogen.ns.hetzner.com
Address:	213.133.100.98#53

_acme-challenge.allofphysics.com	text = "ARANDOMLOOKINGSTRINGHERE"

Now https://allofphysics.com/ looks correct and has no errors! (Same for derivationmap.net)

I expect the renewal instructions on https://physicsderivationgraph.blogspot.com/2021/10/periodic-renewal-of-https-letsencrypt.html to still apply when these certs expire, but I haven't verified that yet:

sudo certbot certonly --webroot \
-w /home/pdg/ui_v8_website_flask_neo4j/certs \
--server https://acme-v02.api.letsencrypt.org/directory \
-d derivationmap.net -d www.derivationmap.net \
-d allofphysics.com -d www.allofphysics.com

Wednesday, January 28, 2026

Oracle cloud configuration for Free Tier

Configuring a VCN (Virtual Cloud Network)

To have the full range of options when setting up the VPS (virtual private server), first create a VCN (virtual cloud network) and a subnet. That way, when you set up the VPS you can select an existing VCN and subnet.

From the user dashboard, on the right-side "Build" menu select "Set up a network with a wizard"

"Create VCN" and then use the wizard for "Create VCN with internet connectivity"

VCN name: pdg_test_vcn

Then "Reserved public IPv4 address" > Create

Configuring my First VPS

From the user dashboard, on the right-side "Build" menu select "Create compute instance" wizard,

For "placement" I selected AD2 (no reason not to go with AD1 or AD3 as far as I can tell)

Under "Advanced Options" I selected "On-demand capacity" out of the options; see https://docs.oracle.com/en-us/iaas/Content/Compute/Concepts/computeoverview.htm#capacity_types for descriptions

Under "Image and shape" the default image is "Oracle Linux 9". My DigitalOcean droplet is currently "Ubuntu 24.04.3 LTS". Oracle offers 8 versions of Ubuntu, so I selected 24.04.

For "Shape" I selected "virtual machine" (rather than "bare metal")

For "Shape series" I selected "Ampere" (rather than AMD or Intel) and "VM.Standard.A1.Flex", which is eligible for Free Tier.

In the Networking section I set up a Primary VNIC (virtual network interface card) which connects to a VCN (virtual cloud network). This is required for a public IP for the Internet.

Primary VNIC name: pdg_test_VNIC

Primary network: "create new virtual cloud network"

new virtual cloud network name: vcn-20260127-0933_pdg

Subnet: create new public subnet; name: subnet-20260127-0933_pdg. CIDR block 10.0.0.0/24

Private

Result: Oracle doesn't have capacity in the AD

"Out of capacity for shape VM.Standard.A1.Flex in availability domain AD-1. Create the instance in a different availability domain or try again later. If you specified a fault domain, try creating the instance without specifying a fault domain. If that doesn’t work, please try again later. Learn more about host capacity."

Saturday, January 24, 2026

VPS price comparison September 2024

Netcup

https://www.netcup.com/en

German company. Has US hosting capability.

as of 2024-09-02, lowest cost relevant configuration: "VPS 250 G11s"

3.35 eur is currently $3.71 usd
2 vCore (x86)
2 GB ECC RAM
64 GB SSD
Traffic included

Unfortunately, the US-based servers are only available for beefy configurations. The "VPS 250 G11s" is hosted in Germany; see https://www.netcup.com/en/server/vps

Namecheap

https://www.namecheap.com/hosting/shared/

Is not a VPS. Provides disk space and pre-configured web services. Not suitable for running Docker.

hetzner

https://www.hetzner.com/cloud/

Servers hosted in Germany or Finland

CX22 is

2 vCPU
4GB RAM
40 GB SSD

for 4.51 eur ($5 USD) per month

AWS S3

AWS S3 is suitable for static sites only. Dynamic content would need to live in lambdas for on-demand execution.

As per https://www.reddit.com/r/aws/comments/mfbgot/is_aws_a_good_alternative_to_a_20mo_vps_for/

AWS LightSail

AWS VPS is "LightSail"

https://calculator.aws/#/createCalculator/Lightsail

The configuration "Bundle:2GB" features

Storage: 60GB
vCPU: 2
Memory: 2GB
Data Transfer Quota: 3TB

and costs $11.77/month

Google Cloud (GCP)

https://cloud.google.com/compute/vm-instance-pricing#general-purpose_machine_type_family

"e2-standard-2"

2 vCPUs
8GB RAM

for $55.08726/month in Virginia, or $48/month in Iowa. Cheaper if you sign up for 1 or 3 years.

linode

https://www.linode.com/pricing/ as of 2024-11-29 offers

1GB RAM, 25GB storage for $5/month
2GB RAM, 50GB storage for $12/month

ioflood

https://ioflood.com/bare-metal-cloud-server.php

Dual Intel Xeon E5-2695v4 CPUs
36 CPU cores
64GB RAM
960GB NVMe SSD
100TB @ 10Gbps (Inbound unmetered)

for $99/month

https://manage.ioflood.com/orderform/index.php?form=Max-95v4-September-2024

VPS price comparison January 2026

DigitalOcean

Currently (as of 2026-01-24) paying $6/month for 2 CPUs, 1GB of RAM.

Was paying $12/month for 2 CPUs, 2GB of RAM (to support Neo4j)

Hetzner

https://www.hetzner.com/cloud/

"CPX11" in US host is 2 vCPUs, 2GB of RAM, 40GB SSD for 5 euros. Doesn't include VAT

Update 2026-01-31: See https://physicsderivationgraph.blogspot.com/2026/01/hetzner-vps-initial-configuration-notes.html

Oracle

https://www.oracle.com/cloud/compute/arm/pricing/

https://www.oracle.com/cloud/free/

"AMD Compute Instance" has 1/8 OCPU and 1 GB memory

"Arm Compute Instance" has 4 Arm-based Ampere A1 cores and 24 GB of memory, usable as 1 VM or up to 4 VMs. Always Free: 3,000 OCPU hours and 18,000 GB hours per month.

I attempted to use Oracle (see https://physicsderivationgraph.blogspot.com/2026/01/oracle-cloud-configuration-for-free-tier.html) but their free tier didn't have any available instances.
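
The Always Free hour allowances quoted above line up with running the full Arm shape around the clock. A quick arithmetic check, assuming a 31-day month:

```python
HOURS_IN_31_DAY_MONTH = 24 * 31  # 744 hours

free_ocpu_hours = 3000  # from the Always Free description above
free_gb_hours = 18000   # from the Always Free description above

# How many OCPUs and GB of memory can run continuously for a whole month?
ocpus_all_month = free_ocpu_hours / HOURS_IN_31_DAY_MONTH
gb_all_month = free_gb_hours / HOURS_IN_31_DAY_MONTH

print(f"{ocpus_all_month:.2f} OCPUs, {gb_all_month:.2f} GB around the clock")
# 4.03 OCPUs, 24.19 GB around the clock
```

So 4 cores and 24 GB of memory, running continuously, stay just inside the monthly allowance.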

Google Compute Engine

https://cloud.google.com/pricing

https://cloud.google.com/products/calculator

"E2" with 2vCPUs and 2GB of RAM is $37.11/month

https://docs.cloud.google.com/free/docs/free-cloud-features says "1 non-preemptible e2-micro VM instance per month in one of the following US regions: Oregon: us-west1. Iowa: us-central1. South Carolina: us-east1. 30 GB-months standard persistent disk. 1 GB of outbound data transfer from North America to all region destinations (excluding China and Australia) per month."

The e2-micro is 0.25 CPUs and 1GB of RAM.

Microsoft Azure

B2pts v2 = 1GB of RAM, 2 vCPUs for $6.13/month

next step up is

B2pls v2 = 4GB of RAM, 2 vCPUs for $24.5280/month

https://azure.microsoft.com/en-us/pricing/details/container-instances/

$3.8909 per GB/month

AWS Lightsail

"Bundles resources like memory, vCPU, and solid-state drive (SSD) storage into one plan, so budgeting is easy."

aka AWS simplified using pre-set configurations.

$12/month for 2 GB Memory 2 vCPUs 60 GB SSD Disk 3 TB Transfer

AWS

https://aws.amazon.com/free/
https://aws.amazon.com/free/free-tier-faqs/

2026-01-28: "Free plan eligible instances include: T3.micro, T3.small, T4g.micro, T4g.small, C7i-flex.large, M7i-flex.large" (source: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-free-tier-usage.html)

"The free plan allows you to experiment with AWS services and build proof-of-concepts at no cost for up to 6 months until you upgrade to a paid plan."

website unavailable: potentially caused by memory leak of fwupd

I have a VPS that runs Ubuntu 24.04.1 LTS (GNU/Linux 6.8.0-85-generic x86_64). The system was out of memory, causing the website to be unavailable. This post documents my troubleshooting and diagnosis.

When I logged into the server with the website not responding, the stats were

 System information as of Fri Jan 16 10:12:48 UTC 2026

  System load:           0.18
  Usage of /:            87.9% of 23.17GB
  Memory usage:          84%
  Swap usage:            0%
  Processes:             117
  Users logged in:       0

I used top to see what processes were causing the load. "Shift+M" sorts by memory used.

top - 10:16:10 up 103 days, 12:20,  1 user,  load average: 0.03, 0.07, 0.03
Tasks: 113 total,   1 running, 112 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.6 us,  1.9 sy,  0.0 ni, 95.1 id,  0.4 wa,  0.0 hi,  0.0 si,  0.0 st 
MiB Mem :    961.6 total,     68.4 free,    905.6 used,    127.0 buff/cache     
MiB Swap:      0.0 total,      0.0 free,      0.0 used.     56.0 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                  
1546860 username  20   0  962272 179692   1012 S   1.9  18.2 184:29.69 gunicorn                                                 
1546859 username  20   0  953968 171696   1020 S   1.9  17.4 159:19.89 gunicorn                                                 
1578333 root      20   0  689004 144056   4176 S   0.0  14.6   0:10.57 fwupd                                                    
    885 root      20   0 2001968  44176  12744 S   0.0   4.5  55:52.04 dockerd                                                  
    325 root      rt   0  288952  27136   8704 S   0.0   2.8  13:08.03 multipathd                                               
    796 root      20   0 1727032  23024   9088 S   0.0   2.3 120:22.17 containerd                                               
1493204 root      20   0 1850364  19216   4352 S   0.0   2.0  66:57.05 snapd                                                    
1546725 username  20   0   64908  15352    768 S   0.0   1.6   1:07.85 gunicorn                                                 
1565013 do-agent  20   0 1238600  14364   8704 S   0.0   1.5   1:08.77 do-agent                                                 
    277 root      19  -1  124224  13840  12620 S   0.0   1.4  16:00.35 systemd-journal                                          
    847 root      20   0  110012  12800   3456 S   0.0   1.3   0:00.12 unattended-upgr
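
The %MEM column can be cross-checked from the RES column (in KiB) and the total memory line (in MiB). This is a small arithmetic sketch using the three largest processes from the output above:

```python
total_mib = 961.6  # from the "MiB Mem : 961.6 total" line
res_kib = {
    "gunicorn (1546860)": 179692,
    "gunicorn (1546859)": 171696,
    "fwupd (1578333)": 144056,
}


def percent_of_ram(kib: int) -> float:
    # RES is reported in KiB, total memory in MiB.
    return kib / (total_mib * 1024) * 100


for name, kib in res_kib.items():
    print(f"{name}: {percent_of_ram(kib):.1f}%")

top_three = sum(percent_of_ram(kib) for kib in res_kib.values())
print(f"top three combined: {top_three:.1f}%")
```

This reproduces the 18.2%, 17.4%, and 14.6% shown by top, and shows that these three processes together hold about half of the 961.6 MiB.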

The thread (https://github.com/fwupd/fwupd/issues/6948) indicates fwupd (a system daemon to allow session software to update firmware) has had memory leaks.

The post (https://www.reddit.com/r/debian/comments/1gilrlm/howto_device_firmware_updates_with_fwupd/) has relevant commands, but I didn't use them.

I used sudo reboot and then, once the system was up, the usage was

 System information as of Fri Jan 16 10:25:22 UTC 2026

  System load:           0.06
  Usage of /:            87.7% of 23.17GB
  Memory usage:          51%
  Swap usage:            0%
  Processes:             108
  Users logged in:       0

I then ran

sudo apt update
sudo apt upgrade

Thursday, December 4, 2025

website unresponsive: diagnostic steps and blocking the AI crawlers

I received an email alert this morning with subject "Wachete error notification" which indicated my website wasn't responsive.

In a browser I verified that https://allofphysics.com/ is hanging. (No immediate error response.) After a few minutes I got "504 Gateway Time-out nginx/1.17.9"

I ssh'd into the VPS (virtual private server) and ran
docker ps
to verify the containers were running.

I used
top
to check the CPU and memory load. Two instances of gunicorn are each using 2% of the CPU and 10% of the RAM. That's expected.

Next I logged into the VPS web portal to review system usage for the past 7 days. There is a noticeable change in the metrics that started suddenly yesterday.

The last interaction I had with the server (more than a week ago) was to update the HTTPS certificates using Let's Encrypt. Although the website had returned a 504 error I could check the certificate expiration in the browser. The certs were valid.

In the logs directory on the server

-rwxrwxrwx 1 usr  usr          0 Sep  3  2024 auth.log*
-rw-r--r-- 1 usr  usr    6194913 Dec  3 14:38 flask_critical_and_error_and_warning.log
-rw-r--r-- 1 usr  usr     383596 Nov 26 06:27 flask_critical_and_error_and_warning.log.1
-rw-r--r-- 1 usr  usr    9999931 Nov 23 18:02 flask_critical_and_error_and_warning.log.2
-rw-r--r-- 1 usr  usr    9945758 Dec  3 14:42 flask_critical_and_error_and_warning_and_info.log
-rw-r--r-- 1 usr  usr    9999983 Dec  2 16:14 flask_critical_and_error_and_warning_and_info.log.1
-rw-r--r-- 1 usr  usr    9999938 Dec  1 16:40 flask_critical_and_error_and_warning_and_info.log.2
-rw-r--r-- 1 usr  usr    1206714 Dec  3 14:42 flask_critical_and_error_and_warning_and_info_and_debug.log
-rw-r--r-- 1 usr  usr    9999916 Dec  3 12:46 flask_critical_and_error_and_warning_and_info_and_debug.log.1
-rw-r--r-- 1 usr  usr    9999926 Dec  3 00:53 flask_critical_and_error_and_warning_and_info_and_debug.log.2
-rw-r--r-- 1 usr  usr  125459598 Dec  3 14:42 gunicorn_access.log
-rw-r--r-- 1 usr  usr  166722892 Dec  3 14:42 gunicorn_error.log
-rw-r--r-- 1 root root 126147128 Dec  4 11:01 nginx_access.log
-rw-r--r-- 1 root root  28785863 Dec  4 11:01 nginx_error.log

Only nginx logs have today's date. That's consistent with the blocker being nginx. Using
tail -f nginx_access.log
I see the latest entries are associated with https://webmaster.petalsearch.com/site/petalbot which says the crawler

"establish an index database which enables users to search the content of your site in Petal search engine and present content recommendations for the user in Huawei Assistant and AI Search services"

Using Gemini 2.5 Flash from https://aistudio.google.com/ I ask

I'm running a webserver that uses nginx and runs on linux. I am interested in blocking certain IP address ranges. Should I configure nginx to filter IP ranges or should I filter using the linux firewall? I want to use the software I already have rather than add yet another tool for this blocking.

and learn that filtering with the Linux firewall is recommended over filtering in nginx.

Next question for Gemini 2.5 Flash LLM is

I'm using Ubuntu for a webserver. How do I determine what firewall is being used from the command line?

I then run the following on my VPS:

$ sudo ufw status verbose
Status: active
Logging: on (low)
Default: deny (incoming), allow (outgoing), deny (routed)
New profiles: skip

To                         Action      From
--                         ------      ----
22/tcp (OpenSSH)           ALLOW IN    Anywhere                  
443                        ALLOW IN    Anywhere                  
80                         ALLOW IN    Anywhere                  
22/tcp (OpenSSH (v6))      ALLOW IN    Anywhere (v6)             
443 (v6)                   ALLOW IN    Anywhere (v6)             
80 (v6)                    ALLOW IN    Anywhere (v6)

I then also verify that nft exists using

$ sudo nft list ruleset
# Warning: table ip filter is managed by iptables-nft, do not touch!
table ip filter {
	chain ufw-before-logging-input {
	}
...long output, snipped...

After a few minutes of reviewing tail -f nginx_access.log, the major offenders in this denial-of-service (DoS) attack appear to be

104.210.140.141 = OpenAI, observed 2025-12-04
114.119.147.137 = PetalBot (for Huawei), observed 2025-12-04
156.59.198.136 = bytedance, observed 2025-12-04
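
Instead of eyeballing the tail output, the offenders could be tallied with a short script. The sketch below assumes the client address is the first whitespace-separated field of each log line (as in nginx's default "combined" format) and groups IPv4 clients by /24 network. The function name and the sample log lines are made up for illustration.

```python
from collections import Counter
import ipaddress


def top_client_networks(log_lines, prefix_length=24, limit=3):
    """Count requests per IPv4 network and return the busiest ones."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue
        try:
            address = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # skip lines that do not start with an IP address
        if address.version != 4:
            continue  # this sketch only groups IPv4 clients
        network = ipaddress.ip_network(f"{address}/{prefix_length}", strict=False)
        counts[str(network)] += 1
    return counts.most_common(limit)


# Made-up sample lines in the assumed log format.
sample_lines = [
    '156.59.198.136 - - [04/Dec/2025:11:00:01 +0000] "GET /a HTTP/1.1" 200 512 "-" "bot"',
    '156.59.198.20 - - [04/Dec/2025:11:00:02 +0000] "GET /b HTTP/1.1" 200 512 "-" "bot"',
    '114.119.147.137 - - [04/Dec/2025:11:00:03 +0000] "GET /c HTTP/1.1" 200 512 "-" "bot"',
    '203.0.113.7 - - [04/Dec/2025:11:00:04 +0000] "GET / HTTP/1.1" 200 512 "-" "browser"',
    '156.59.198.136 - - [04/Dec/2025:11:00:05 +0000] "GET /d HTTP/1.1" 200 512 "-" "bot"',
]
print(top_client_networks(sample_lines))
```

To run it against the real log, pass an open file handle for nginx_access.log in place of sample_lines.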

LLM query:

ufw block IP address range for web server

followed by

how to pick the correct CIDR value for IP blocking?

from which I learn that /24 fixes the first three octets, so the rule covers every value of the last octet (0 to 255)
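
Python's standard ipaddress module can confirm what a /24 covers. In this small sketch, strict=False tells the module to accept an address with host bits set and normalize it to the network address:

```python
import ipaddress

# An address inside the range, written with a /24 prefix.
network = ipaddress.ip_network("156.59.198.136/24", strict=False)

print(network)                  # 156.59.198.0/24
print(network.num_addresses)    # 256
print(network[0], network[-1])  # 156.59.198.0 156.59.198.255
print(ipaddress.ip_address("156.59.198.136") in network)  # True
```

Writing the range as 156.59.198.136/24 or 156.59.198.0/24 therefore describes the same 256 addresses.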

I then run

$ sudo ufw deny from 156.59.198.136/24
WARN: Rule changed after normalization
Rule added
$ sudo ufw deny from 114.119.147.0/24
Rule added
$ sudo ufw deny from 104.210.140.0/24
Rule added

Check the results

$ sudo ufw status
Status: active

To                         Action      From
--                         ------      ----
OpenSSH                    ALLOW       Anywhere                  
443                        ALLOW       Anywhere                  
80                         ALLOW       Anywhere                  
Anywhere                   DENY        156.59.198.0/24           
Anywhere                   DENY        114.119.147.0/24          
Anywhere                   DENY        104.210.140.0/24          
OpenSSH (v6)               ALLOW       Anywhere (v6)             
443 (v6)                   ALLOW       Anywhere (v6)             
80 (v6)                    ALLOW       Anywhere (v6)

The LLM had warned me that "If you find that the new deny rules are at the bottom of the list, you may need to use the insert function to put them at the top (e.g., position 1 and 2)."

Gemini 2.5 says the general best practice for firewall rules is:

Specific DENY rules (blocking known bad actors).
Specific ALLOW rules (allowing trusted hosts/networks).
General ALLOW rules (allowing public services).
General DENY rules (the default policy, often implied).
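
The reason the order matters is that rules are checked from top to bottom and the first match decides. The following is a simplified model of that behavior, not ufw's actual implementation; the rule tuples and function name are invented for this sketch.

```python
import ipaddress


def first_match(rules, source_ip, port, default="DENY"):
    """Return the action of the first rule matching source_ip and port."""
    address = ipaddress.ip_address(source_ip)
    for action, network, rule_port in rules:
        # None means "any source" or "any port" in this toy model.
        in_network = network is None or address in ipaddress.ip_network(network)
        port_matches = rule_port is None or rule_port == port
        if in_network and port_matches:
            return action
    return default  # the implied default policy


allow_first = [("ALLOW", None, 443), ("DENY", "156.59.198.0/24", None)]
deny_first = [("DENY", "156.59.198.0/24", None), ("ALLOW", None, 443)]

print(first_match(allow_first, "156.59.198.136", 443))  # ALLOW
print(first_match(deny_first, "156.59.198.136", 443))   # DENY
```

With the general allow rule listed first, the crawler's request to port 443 matches it and never reaches the deny rule. Moving the deny rule to the top blocks the crawler while other clients are still allowed.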

Gemini 2.5's advice was almost correct. The LLM got the rule indices wrong. Here are the commands I ran:

$ sudo ufw status
Status: active

To                         Action      From
--                         ------      ----
OpenSSH                    ALLOW       Anywhere                  
443                        ALLOW       Anywhere                  
80                         ALLOW       Anywhere                  
Anywhere                   DENY        156.59.198.0/24           
Anywhere                   DENY        114.119.147.0/24          
Anywhere                   DENY        104.210.140.0/24          
OpenSSH (v6)               ALLOW       Anywhere (v6)             
443 (v6)                   ALLOW       Anywhere (v6)             
80 (v6)                    ALLOW       Anywhere (v6)             

$ sudo ufw status numbered
Status: active

     To                         Action      From
     --                         ------      ----
[ 1] OpenSSH                    ALLOW IN    Anywhere                  
[ 2] 443                        ALLOW IN    Anywhere                  
[ 3] 80                         ALLOW IN    Anywhere                  
[ 4] Anywhere                   DENY IN     156.59.198.0/24           
[ 5] Anywhere                   DENY IN     114.119.147.0/24          
[ 6] Anywhere                   DENY IN     104.210.140.0/24          
[ 7] OpenSSH (v6)               ALLOW IN    Anywhere (v6)             
[ 8] 443 (v6)                   ALLOW IN    Anywhere (v6)             
[ 9] 80 (v6)                    ALLOW IN    Anywhere (v6)             

$ sudo ufw delete 4
Deleting:
 deny from 156.59.198.0/24
Proceed with operation (y|n)? y
Rule deleted
$ sudo ufw insert 1 deny from 156.59.198.0/24 to any
Rule inserted
$ sudo ufw delete 4
Deleting:
 allow 80
Proceed with operation (y|n)? n
Aborted
$ sudo ufw status numbered
Status: active

     To                         Action      From
     --                         ------      ----
[ 1] Anywhere                   DENY IN     156.59.198.0/24           
[ 2] OpenSSH                    ALLOW IN    Anywhere                  
[ 3] 443                        ALLOW IN    Anywhere                  
[ 4] 80                         ALLOW IN    Anywhere                  
[ 5] Anywhere                   DENY IN     114.119.147.0/24          
[ 6] Anywhere                   DENY IN     104.210.140.0/24          
[ 7] OpenSSH (v6)               ALLOW IN    Anywhere (v6)             
[ 8] 443 (v6)                   ALLOW IN    Anywhere (v6)             
[ 9] 80 (v6)                    ALLOW IN    Anywhere (v6)             

$ sudo ufw delete 5
Deleting:
 deny from 114.119.147.0/24
Proceed with operation (y|n)? y
Rule deleted
$ sudo ufw insert 2 deny from 114.119.147.0/24 to any
Rule inserted
$ sudo ufw status numbered
Status: active

     To                         Action      From
     --                         ------      ----
[ 1] Anywhere                   DENY IN     156.59.198.0/24           
[ 2] Anywhere                   DENY IN     114.119.147.0/24          
[ 3] OpenSSH                    ALLOW IN    Anywhere                  
[ 4] 443                        ALLOW IN    Anywhere                  
[ 5] 80                         ALLOW IN    Anywhere                  
[ 6] Anywhere                   DENY IN     104.210.140.0/24          
[ 7] OpenSSH (v6)               ALLOW IN    Anywhere (v6)             
[ 8] 443 (v6)                   ALLOW IN    Anywhere (v6)             
[ 9] 80 (v6)                    ALLOW IN    Anywhere (v6)             

$ sudo ufw delete 6
Deleting:
 deny from 104.210.140.0/24
Proceed with operation (y|n)? y
Rule deleted
$ sudo ufw insert 3 deny from 104.210.140.0/24 to any
Rule inserted
$ sudo ufw status numbered
Status: active

     To                         Action      From
     --                         ------      ----
[ 1] Anywhere                   DENY IN     156.59.198.0/24           
[ 2] Anywhere                   DENY IN     114.119.147.0/24          
[ 3] Anywhere                   DENY IN     104.210.140.0/24          
[ 4] OpenSSH                    ALLOW IN    Anywhere                  
[ 5] 443                        ALLOW IN    Anywhere                  
[ 6] 80                         ALLOW IN    Anywhere                  
[ 7] OpenSSH (v6)               ALLOW IN    Anywhere (v6)             
[ 8] 443 (v6)                   ALLOW IN    Anywhere (v6)             
[ 9] 80 (v6)                    ALLOW IN    Anywhere (v6)

In a web browser I visited https://allofphysics.com/ and the page loaded immediately. Yay!
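The reordering above matters because ufw evaluates rules from top to bottom and the first match wins. Here is a minimal Python sketch of that first-match behavior, using the IPv4 rules from the first and last listings. The function name `first_match`, the sample address, and the mapping of the OpenSSH profile to port 22 are assumptions for illustration; this is not how ufw is implemented internally.

```python
import ipaddress

def first_match(rules, source_ip, port):
    """Return the action of the first rule matching (source_ip, port)."""
    addr = ipaddress.ip_address(source_ip)
    for action, network, rule_port in rules:
        in_network = addr in ipaddress.ip_network(network)
        port_matches = rule_port is None or rule_port == port
        if in_network and port_matches:
            return action
    return "DEFAULT"  # no rule matched; the default policy applies

ANYWHERE = "0.0.0.0/0"
# (action, source network, destination port or None for any port)
before = [
    ("ALLOW", ANYWHERE, 22),   # OpenSSH (assumed to be port 22)
    ("ALLOW", ANYWHERE, 443),
    ("ALLOW", ANYWHERE, 80),
    ("DENY", "156.59.198.0/24", None),
]
# Same rules with the DENY rule moved to position 1
after = [before[3]] + before[:3]

print(first_match(before, "156.59.198.7", 443))  # ALLOW
print(first_match(after, "156.59.198.7", 443))   # DENY
```

With the deny rule listed after the allow rules, traffic from the blocked range to port 443 is accepted before the deny rule is ever consulted.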

Sunday, November 23, 2025

a single step fully verified using SymPy and Lean

As a very simple example, start with the equation

T = 1/f

where T is the period of oscillation and f is the frequency of oscillation. A transformation would be to multiply both sides by f to get

f T = 1

Verification of a step using a Computer Algebra System like SymPy

The single step above could be verified using a Computer Algebra System like SymPy. The generic form of the inference rule is "multiply both sides of (LHS=RHS) by feed to get (LHS*feed=RHS*feed)". To show the inference rule was correctly applied, we want to show that

LHS_in*feed == LHS_out
and
RHS_in*feed == RHS_out

Another way to describe the equivalence is that the difference should be zero:

LHS_in*feed - LHS_out = 0
and
RHS_in*feed - RHS_out = 0

That's the generic formulation of the inference rule check. In this step,

LHS_in = T
RHS_in = 1/f
feed = f
LHS_out = f T
RHS_out = 1

This check is implemented in line 496 of validate_steps_sympy.py as

import sympy
def multiply_both_sides_by(LHS_in, RHS_in, feed, LHS_out, RHS_out):
    diff1 = sympy.simplify(sympy.Mul(LHS_in, feed) - LHS_out)
    diff2 = sympy.simplify(sympy.Mul(RHS_in, feed) - RHS_out)
    if (diff1 == 0) and (diff2 == 0):
        return "valid"
    else:
        return "LHS diff is " + str(diff1) + "\n" + "RHS diff is " + str(diff2)

We can run that using

>>> import sympy
>>> print(sympy.__version__)
1.11.1
>>> f, T = sympy.symbols('f T')
>>> multiply_both_sides_by(T, 1/f, f, f*T, 1)
'valid'

Wahoo! The step has been validated using SymPy to show the inference rule is applied correctly.

The cool part is that the "multiply_both_sides_by()" Python function is generic to any input expression. The same check can be used for many different steps that use the inference rule. Using SymPy we can gain confidence that the expressions associated with a step were correctly transformed.

Consistency of dimensionality using SymPy

In addition to evaluating the transformation of symbols in a step, we can verify the consistency of dimensions for each expression. That requires more than just the symbols -- the user will have to specify the dimensions of each symbol.

For example, in JSON for period T we have

        "9491": {
            "category": "variable",
            "dimensions": {
                "amount of substance": 0,
                "electric charge": 0,
                "length": 0,
                "luminous intensity": 0,
                "mass": 0,
                "temperature": 0,
                "time": 1
            },
            "latex": "T",
            "name": "period",
            "scope": [
                "real"
            ]
        },

The script validate_dimensions_sympy.py

>>> import sympy
>>> from sympy.physics.units import mass, length, time, temperature, luminous_intensity, amount_of_substance, charge  # type: ignore
>>> from sympy.physics.units.systems.si import dimsys_SI

>>> from sympy.parsing.latex import parse_latex
>>> sympy.srepr(parse_latex('T = 1/f'))

TODO
>>> determine_consistency = dimsys_SI.equivalent_dims( eval(str(LHS)), eval(str(RHS)) )

See https://physicsderivationgraph.blogspot.com/2020/07/function-latex-for-sympy.html
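While the SymPy-based check above is still a TODO, the underlying idea can be sketched in plain Python: store each symbol's dimensions as a dictionary of exponents (as in the JSON entry for T), add exponents when multiplying, and scale them when raising to a power. The entry for f and all helper names below are assumptions for illustration; they are not taken from validate_dimensions_sympy.py.

```python
BASE_DIMENSIONS = (
    "amount of substance", "electric charge", "length",
    "luminous intensity", "mass", "temperature", "time",
)

def make_dims(nonzero=None):
    """Full exponent dictionary; unspecified base dimensions are 0."""
    dims = dict.fromkeys(BASE_DIMENSIONS, 0)
    dims.update(nonzero or {})
    return dims

def multiply(a, b):
    """Dimensions of a product: exponents add."""
    return {key: a[key] + b[key] for key in BASE_DIMENSIONS}

def power(a, n):
    """Dimensions of a**n: exponents scale by n."""
    return {key: a[key] * n for key in BASE_DIMENSIONS}

DIMENSIONLESS = make_dims()
period_T = make_dims({"time": 1})      # from the JSON entry above
frequency_f = make_dims({"time": -1})  # assumed entry for f

# T = 1/f: both sides should have the dimension of time
lhs = period_T
rhs = multiply(DIMENSIONLESS, power(frequency_f, -1))
print(lhs == rhs)  # True

# f T = 1: both sides should be dimensionless
print(multiply(frequency_f, period_T) == DIMENSIONLESS)  # True
```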

Verification of the step using Lean

To prove

(T=1/f) -> (f*T=1)

additional constraints are needed for reasoning around division by 0. If you expect to avoid negative or zero frequency or period, you could define f and T to be "positive real numbers" (which exclude zero; non-negative reals include zero). This does define the context more precisely, but there is a price: we won't have nearly as many proofs for positive reals as we have for reals. The alternative is to add additional hypotheses as constraints. The latter approach (additional hypotheses) is favored.

import Mathlib.Data.Real.Basic
import Mathlib.Tactic -- Import standard tactics, specifically import Mathlib.Tactic.FieldSimp

theorem inversion_amultbeq1
  (a b : Real)
  (hb_ne_zero : b ≠ 0) :
  a = 1 / b <-> a * b = 1 := by
  -- field_simp clears the denominator 'b' on the LHS,
  -- turning (a = 1 / b) into (a * b = 1) automatically.
  field_simp [hb_ne_zero]
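As a sketch of how the generic theorem connects back to the step, it can be specialized to the period T and frequency f. This assumes the theorem above compiles as written and is in scope; the name `period_times_frequency` is made up for illustration.

```lean
theorem period_times_frequency
  (T f : Real)
  (hf_ne_zero : f ≠ 0)
  (h : T = 1 / f) :
  f * T = 1 := by
  -- the forward direction of the iff gives T * f = 1
  have hTf : T * f = 1 := (inversion_amultbeq1 T f hf_ne_zero).mp h
  -- reorder the product so the goal matches hTf
  rw [mul_comm f T]
  exact hTf
```

The hypothesis `hf_ne_zero` is the "additional hypothesis" approach described above: f and T stay ordinary reals, and the non-zero condition is supplied where it is needed.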

Sunday, November 2, 2025

finding an alternative to Neo4j; a schema for SQL

I had been using Neo4j's property graph for the Physics Derivation Graph for a few years and recently stopped due to the 1GB of memory used by the server that is constantly running. (That 1GB of RAM is unrelated to the size of the underlying database, which in my case was a few hundred nodes.)

I realized that what I want is a SQLite-like "offline" database that only uses noticeable CPU and RAM when there's a query. I'm OK with the database's initialization cost resulting in slower queries.

Search for Alternatives

https://jgrapht.org/ and NetworkX are irrelevant to the needs of the Physics Derivation Graph

https://en.wikipedia.org/wiki/Blazegraph is unsupported since 2018

KuzuDB was exactly what I was interested in -- an embedded offline property graph database compatible with Cypher. Coincidentally, https://kuzudb.github.io/ announced Oct 10 (3 weeks ago) that the project was being terminated immediately.
https://news.ycombinator.com/item?id=45560036
https://www.reddit.com/r/Database/comments/1o3w73z/status_of_kuzudb_from_kuzu_inc/
https://www.theregister.com/2025/10/14/kuzudb_abandoned/
A fork: https://github.com/Kineviz/bighorn

Possible contenders:

However, the most reliable approach may be to switch to SQLite and use LLMs to craft Cypher-like queries. This is inelegant (I had been exposing read-only Cypher capabilities on the web) and requires coming up with a schema (something I have no experience with).

Convert Property Graph to Relational Database for SQL

Here's the prompt I provided Gemini 2.5 Pro:

I have a complicated property graph in Neo4j and want to convert to SQLite. Design a schema for the relational database.

The nodes present in the property graph representation are below.


node type: 'derivation'
properties={'author_name_latex': str,  
  'name_latex': str,     
  'id': str,  
  'abstract_latex': str,  
  'reference_latex': str, 
  'created_datetime': str} 

node type: 'step'
properties={'author_name_latex': str,   
'note_before_step_latex': str, 
'note_after_step_latex': str, 
'id': str,   
 'created_datetime': str} 

node type: 'feed'
properties={'author_name_latex': str,  
'id': str,  
'sympy': str, 
'lean': str,   
 'latex': str} 

node type: 'expression'
properties={'author_name_latex': str,   
 'name_latex': str, 
 'description_latex': str,  
 'reference_latex': str,   
 'id': str,         
 'latex': str,  
 'lean': str,   
 'sympy': str}  

node type: 'operation'
properties={'author_name_latex': str, 
'name_latex': str,  'argument_count': int,  
'description_latex': str,     
 'reference_latex': str, 
   'id': str,  
   'latex': str} 

node type: 'scalar'
properties={'author_name_latex': str,
 'name_latex': str, 
 'description_latex': str,  
 'id': str,         
 'reference_latex': str,   
 'latex': str,     
 'scope': str, 
 'variable_or_constant': str,  
 'domain': str,   
 'dimension_length': int,      
 'dimension_time': int,        
 'dimension_mass': int,        
 'dimension_temperature': int, 
 'dimension_electric_charge': int,     
 'dimension_amount_of_substance': int, 
 'dimension_luminous_intensity': int}  

node type: 'value_and_units'
properties={'author_name_latex': str
 'id': str,         
 'dimension_length_unit': int, 
 'dimension_time_unit': int,   
 'dimension_mass_unit': int,   
 'dimension_temperature_unit': int, 
 'dimension_electric_charge_unit': int,  
 'dimension_amount_of_substance_unit': int,
 'dimension_luminous_intensity_unit': int, 

node type: 'vector'
properties={'author_name_latex': str,
 'name_latex': str, 
 'description_latex': str,  
 'reference_latex': str,   
 'id': str,         
 'latex': str}     

node type: 'matrix'
properties={'author_name_latex': str,
 'name_latex': str, 
 'description_latex': str,  
 'id': str,         
 'reference_latex': str,   
 'latex': str}     

node type: 'quantum_operator'
properties={'author_name_latex': str,
 'name_latex': str, 
 'description_latex': str,  
 'requires_arguments': bool,
 'reference_latex': str,   
 'id': str,         
 'latex': str}

node type: 'inference_rule'
properties={'author_name_latex': str,  
 'number_of_inputs': int,     
 'number_of_outputs': int,    
 'number_of_feeds': int,      
 'name_latex': str, 
 'id': str,         
 'latex': str}      





The nodes and edges occur only in certain combinations described below.


edge 'HAS_STEP' properties={'sequence_index': int}
connects node type "derivation"
to node type "step"

edge 'HAS_INPUT' properties={'sequence_index': int}
connects node type "step"
to node type "expression"

edge 'HAS_FEED' properties={'sequence_index': int}
connects node type "step"
to node type "feed"

edge 'HAS_OUTPUT' properties={'sequence_index': int}
connects node type "step"
to node type "expression"

edge 'HAS_INFERENCE_RULE'
connects node type "step"
to node type "inference_rule"


edge 'HAS_SYMBOL'
connects node type "expression"
to node type "scalar"

edge 'HAS_SYMBOL'
connects node type "expression"
to node type "vector"

edge 'HAS_SYMBOL'
connects node type "expression"
to node type "matrix"

edge 'HAS_SYMBOL'
connects node type "expression"
to node type "operation"

edge 'HAS_SYMBOL'
connects node type "expression"
to node type "quantum_operator"

edge 'HAS_SYMBOL'
connects node type "feed"
to node type "scalar"

edge 'HAS_SYMBOL'
connects node type "feed"
to node type "vector"

edge 'HAS_SYMBOL'
connects node type "feed"
to node type "matrix"

edge 'HAS_SYMBOL'
connects node type "feed"
to node type "operation"

edge 'HAS_SYMBOL'
connects node type "feed"
to node type "quantum_operator"


edge 'HAS_VALUE'
connects node type "scalar"
to node type "value_and_units"

Response from Gemini:

Converting a property graph to a relational model involves translating nodes into tables and edges into relationships (either through foreign keys or dedicated join tables). Here is a comprehensive SQLite schema designed to represent your Neo4j graph structure, balancing relational purity with the flexibility of the original graph.

Design Principles

Nodes as Tables: Each distinct node type (`derivation`, `step`, `expression`, etc.) will become its own table.
Properties as Columns: The properties of each node will become columns in the corresponding table. The node's `id` will serve as the `PRIMARY KEY`.
Edges as Relationships:
- One-to-Many / One-to-One Edges: These are modeled using a foreign key column in the "many" or "child" table. For example, since a `step` belongs to one `derivation`, the `steps` table will have a `derivation_id` column.
- Many-to-Many Edges: These are modeled using a separate "join table" (or "link table") that contains foreign keys to the two connected tables. For example, a `step` can have many `expression` inputs, and an `expression` can be an input to many `steps`. This requires a `step_inputs` join table.
Polymorphic Edges: The `HAS_SYMBOL` edge is polymorphic because it can connect to multiple different types of symbol nodes (`scalar`, `vector`, etc.). To handle this cleanly without losing foreign key integrity, we will create a dedicated join table for each specific relationship (e.g., `expression_scalars`, `expression_vectors`). A more "graph-like" but less relationally pure alternative is discussed in the rationale section.

SQLite Schema Definition

Here are the `CREATE TABLE` statements for the SQLite database.

Node Tables

These tables store the data for each type of node.

-- The top-level container for a mathematical derivation.
CREATE TABLE derivations (
    id TEXT PRIMARY KEY,
    author_name_latex TEXT,
    name_latex TEXT,
    abstract_latex TEXT,
    reference_latex TEXT,
    created_datetime TEXT NOT NULL
);

-- Represents a single logical step in a derivation.
-- It connects to a derivation (one-to-many) and an inference_rule (many-to-one).
CREATE TABLE steps (
    id TEXT PRIMARY KEY,
    derivation_id TEXT NOT NULL,
    inference_rule_id TEXT NOT NULL,
    sequence_index INTEGER NOT NULL, -- From the 'HAS_STEP' edge property
    author_name_latex TEXT,
    note_before_step_latex TEXT,
    note_after_step_latex TEXT,
    created_datetime TEXT NOT NULL,
    FOREIGN KEY (derivation_id) REFERENCES derivations(id) ON DELETE CASCADE,
    FOREIGN KEY (inference_rule_id) REFERENCES inference_rules(id) ON DELETE RESTRICT,
    UNIQUE (derivation_id, sequence_index) -- A derivation cannot have two steps at the same index
);

-- Represents the mathematical content of an expression.
CREATE TABLE expressions (
    id TEXT PRIMARY KEY,
    author_name_latex TEXT,
    name_latex TEXT,
    description_latex TEXT,
    reference_latex TEXT,
    latex TEXT,
    lean TEXT,
    sympy TEXT
);

-- Represents feed-in values or assumptions for a step.
CREATE TABLE feeds (
    id TEXT PRIMARY KEY,
    author_name_latex TEXT,
    sympy TEXT,
    lean TEXT,
    latex TEXT
);

-- Represents a rule of logic or mathematics used in a step.
CREATE TABLE inference_rules (
    id TEXT PRIMARY KEY,
    author_name_latex TEXT,
    name_latex TEXT,
    latex TEXT,
    number_of_inputs INTEGER,
    number_of_outputs INTEGER,
    number_of_feeds INTEGER
);

-- Represents a mathematical operation like addition or differentiation.
CREATE TABLE operations (
    id TEXT PRIMARY KEY,
    author_name_latex TEXT,
    name_latex TEXT,
    description_latex TEXT,
    reference_latex TEXT,
    latex TEXT,
    argument_count INTEGER
);

-- Represents a scalar value (a single number).
CREATE TABLE scalars (
    id TEXT PRIMARY KEY,
    author_name_latex TEXT,
    name_latex TEXT,
    description_latex TEXT,
    reference_latex TEXT,
    latex TEXT,
    scope TEXT,
    variable_or_constant TEXT,
    domain TEXT,
    dimension_length INTEGER,
    dimension_time INTEGER,
    dimension_mass INTEGER,
    dimension_temperature INTEGER,
    dimension_electric_charge INTEGER,
    dimension_amount_of_substance INTEGER,
    dimension_luminous_intensity INTEGER
);

-- Represents a vector.
CREATE TABLE vectors (
    id TEXT PRIMARY KEY,
    author_name_latex TEXT,
    name_latex TEXT,
    description_latex TEXT,
    reference_latex TEXT,
    latex TEXT
);

-- Represents a matrix.
CREATE TABLE matrices (
    id TEXT PRIMARY KEY,
    author_name_latex TEXT,
    name_latex TEXT,
    description_latex TEXT,
    reference_latex TEXT,
    latex TEXT
);

-- Represents a quantum mechanical operator.
CREATE TABLE quantum_operators (
    id TEXT PRIMARY KEY,
    author_name_latex TEXT,
    name_latex TEXT,
    description_latex TEXT,
    reference_latex TEXT,
    latex TEXT,
    requires_arguments INTEGER -- Using INTEGER for BOOLEAN (0=false, 1=true)
);

-- Represents the numerical value and units for a scalar.
-- This has a one-to-one relationship with a scalar.
CREATE TABLE values_and_units (
    id TEXT PRIMARY KEY,
    scalar_id TEXT NOT NULL UNIQUE, -- The UNIQUE constraint enforces the 1-to-1 relationship
    author_name_latex TEXT,
    dimension_length_unit INTEGER,
    dimension_time_unit INTEGER,
    dimension_mass_unit INTEGER,
    dimension_temperature_unit INTEGER,
    dimension_electric_charge_unit INTEGER,
    dimension_amount_of_substance_unit INTEGER,
    dimension_luminous_intensity_unit INTEGER,
    FOREIGN KEY (scalar_id) REFERENCES scalars(id) ON DELETE CASCADE
);

Relationship (Join) Tables

These tables model the many-to-many relationships defined by the edges.

-- Links steps to their input expressions.
CREATE TABLE step_inputs (
    step_id TEXT NOT NULL,
    expression_id TEXT NOT NULL,
    sequence_index INTEGER NOT NULL,
    PRIMARY KEY (step_id, sequence_index),
    FOREIGN KEY (step_id) REFERENCES steps(id) ON DELETE CASCADE,
    FOREIGN KEY (expression_id) REFERENCES expressions(id) ON DELETE CASCADE
);

-- Links steps to their output expressions.
CREATE TABLE step_outputs (
    step_id TEXT NOT NULL,
    expression_id TEXT NOT NULL,
    sequence_index INTEGER NOT NULL,
    PRIMARY KEY (step_id, sequence_index),
    FOREIGN KEY (step_id) REFERENCES steps(id) ON DELETE CASCADE,
    FOREIGN KEY (expression_id) REFERENCES expressions(id) ON DELETE CASCADE
);

-- Links steps to their feed-in values.
CREATE TABLE step_feeds (
    step_id TEXT NOT NULL,
    feed_id TEXT NOT NULL,
    sequence_index INTEGER NOT NULL,
    PRIMARY KEY (step_id, sequence_index),
    FOREIGN KEY (step_id) REFERENCES steps(id) ON DELETE CASCADE,
    FOREIGN KEY (feed_id) REFERENCES feeds(id) ON DELETE CASCADE
);

-- Join tables for the polymorphic 'HAS_SYMBOL' relationships
-- These tables link expressions and feeds to the various symbol types they contain.

-- Expression-to-Symbol Links
CREATE TABLE expression_symbols (
    expression_id TEXT NOT NULL,
    symbol_id TEXT NOT NULL,
    symbol_type TEXT NOT NULL, -- e.g., 'scalar', 'vector', 'matrix', 'operation', 'quantum_operator'
    PRIMARY KEY (expression_id, symbol_id),
    FOREIGN KEY (expression_id) REFERENCES expressions(id) ON DELETE CASCADE
);

-- Feed-to-Symbol Links
CREATE TABLE feed_symbols (
    feed_id TEXT NOT NULL,
    symbol_id TEXT NOT NULL,
    symbol_type TEXT NOT NULL, -- e.g., 'scalar', 'vector', 'matrix', 'operation', 'quantum_operator'
    PRIMARY KEY (feed_id, symbol_id),
    FOREIGN KEY (feed_id) REFERENCES feeds(id) ON DELETE CASCADE
);

Rationale and Explanation

Direct Relationships (`derivation` -> `step`, `step` -> `inference_rule`): The `HAS_STEP` edge connects one derivation to many steps. This is a classic one-to-many relationship, best modeled by adding a `derivation_id` foreign key to the `steps` table. The `sequence_index` from the edge property is also placed here. The same logic applies to the `HAS_INFERENCE_RULE` edge.

Step Inputs/Outputs/Feeds: A `step` can have multiple ordered inputs, outputs, and feeds. These are many-to-many relationships modeled with the join tables `step_inputs`, `step_outputs`, and `step_feeds`. The `sequence_index` property from the edge is a crucial part of the primary key to maintain order and uniqueness.

One-to-One Relationship (`scalar` -> `value_and_units`): The `HAS_VALUE` edge is modeled by adding a `scalar_id` foreign key to the `values_and_units` table. Adding a `UNIQUE` constraint on `scalar_id` ensures that each scalar can have at most one `value_and_units` entry, perfectly modeling the one-to-one relationship.

Handling Polymorphism (`HAS_SYMBOL`): This is the most complex relationship to model. An `expression` or a `feed` can contain many symbols of different types.

Chosen Approach: The schema uses two tables, `expression_symbols` and `feed_symbols`. Each table stores the ID of the source (`expression_id` or `feed_id`), the ID of the target symbol (`symbol_id`), and a `symbol_type` column to tell you which table to join with to get the symbol's details (e.g., `scalars`, `vectors`).
Advantage: This approach is pragmatic. It keeps the number of tables manageable.
Trade-off: It sacrifices full relational integrity because you cannot create a native `FOREIGN KEY` on `symbol_id` that points to five different tables. You would enforce this relationship at the application layer (e.g., when you retrieve an `expression`'s symbols, you read the `symbol_type` and then query the appropriate table).
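The application-layer lookup described in the trade-off can be sketched with Python's built-in sqlite3 module. This is a minimal illustration using only two of the symbol tables; the function name `symbols_for_expression` and the sample rows (other than the ID 9491 used for T earlier on this blog) are invented.

```python
import sqlite3

# Maps symbol_type values to table names; a fixed whitelist avoids
# building SQL from untrusted strings.
SYMBOL_TABLES = {
    "scalar": "scalars",
    "vector": "vectors",
    "matrix": "matrices",
    "operation": "operations",
    "quantum_operator": "quantum_operators",
}

def symbols_for_expression(conn, expression_id):
    """Return (symbol_type, latex) for each symbol linked to an expression."""
    links = conn.execute(
        "SELECT symbol_id, symbol_type FROM expression_symbols "
        "WHERE expression_id = ? ORDER BY symbol_id",
        (expression_id,),
    ).fetchall()
    found = []
    for symbol_id, symbol_type in links:
        table = SYMBOL_TABLES[symbol_type]  # KeyError flags an unknown type
        row = conn.execute(
            f"SELECT latex FROM {table} WHERE id = ?", (symbol_id,)
        ).fetchone()
        if row is None:
            # The database cannot catch this; there is no foreign key on symbol_id.
            raise LookupError(f"dangling {symbol_type} id {symbol_id}")
        found.append((symbol_type, row[0]))
    return found

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scalars (id TEXT PRIMARY KEY, latex TEXT);
CREATE TABLE operations (id TEXT PRIMARY KEY, latex TEXT);
CREATE TABLE expression_symbols (
    expression_id TEXT NOT NULL,
    symbol_id TEXT NOT NULL,
    symbol_type TEXT NOT NULL,
    PRIMARY KEY (expression_id, symbol_id)
);
INSERT INTO scalars VALUES ('9491', 'T'), ('9492', 'f');
INSERT INTO operations VALUES ('op1', '/');
INSERT INTO expression_symbols VALUES
    ('e1', '9491', 'scalar'), ('e1', '9492', 'scalar'), ('e1', 'op1', 'operation');
""")
print(symbols_for_expression(conn, "e1"))
# [('scalar', 'T'), ('scalar', 'f'), ('operation', '/')]
```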

In my assessment, the SQL schema is inelegant compared to the property graph representation.
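To make the comparison concrete, here is a minimal sketch using Python's built-in sqlite3 module and a trimmed-down subset of the schema (only the columns needed for one query). A single Cypher pattern becomes a chain of three joins. The sample rows and IDs are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE derivations (id TEXT PRIMARY KEY, name_latex TEXT);
CREATE TABLE steps (
    id TEXT PRIMARY KEY,
    derivation_id TEXT NOT NULL REFERENCES derivations(id),
    sequence_index INTEGER NOT NULL
);
CREATE TABLE expressions (id TEXT PRIMARY KEY, latex TEXT);
CREATE TABLE step_inputs (
    step_id TEXT NOT NULL REFERENCES steps(id),
    expression_id TEXT NOT NULL REFERENCES expressions(id),
    sequence_index INTEGER NOT NULL,
    PRIMARY KEY (step_id, sequence_index)
);
INSERT INTO derivations VALUES ('d1', 'period and frequency');
INSERT INTO expressions VALUES ('e1', 'T = 1/f'), ('e2', 'f T = 1');
INSERT INTO steps VALUES ('s2', 'd1', 2), ('s1', 'd1', 1);
INSERT INTO step_inputs VALUES ('s1', 'e1', 1), ('s2', 'e2', 1);
""")

# Roughly the SQL counterpart of the Cypher pattern
# (derivation)-[:HAS_STEP]->(step)-[:HAS_INPUT]->(expression)
rows = conn.execute("""
    SELECT s.sequence_index, e.latex
    FROM derivations AS d
    JOIN steps AS s ON s.derivation_id = d.id
    JOIN step_inputs AS si ON si.step_id = s.id
    JOIN expressions AS e ON e.id = si.expression_id
    WHERE d.id = ?
    ORDER BY s.sequence_index, si.sequence_index
""", ("d1",)).fetchall()

for step_index, latex in rows:
    print(step_index, latex)
```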

The use of `ON DELETE CASCADE` is a guess by the LLM and is not always the desired behavior. (`ON DELETE CASCADE` means that when the parent row is deleted, the linked child rows are deleted automatically.)
The alternative is `ON DELETE RESTRICT` (or `NO ACTION`). It prevents deletion of the parent row as long as any child rows exist; the database returns an error instead. This is used when you want to protect the child data.
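The difference between the two behaviors can be seen in a small experiment with Python's built-in sqlite3 module. The `parent` and `child` tables and the function name are placeholders, not part of the schema above. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON` is issued on the connection.

```python
import sqlite3

def try_delete_parent(on_delete):
    """Delete a parent row and report what happened to its child row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
    conn.execute("CREATE TABLE parent (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child (id TEXT PRIMARY KEY, parent_id TEXT NOT NULL, "
        f"FOREIGN KEY (parent_id) REFERENCES parent(id) ON DELETE {on_delete})"
    )
    conn.execute("INSERT INTO parent VALUES ('p1')")
    conn.execute("INSERT INTO child VALUES ('c1', 'p1')")
    try:
        conn.execute("DELETE FROM parent WHERE id = 'p1'")
        outcome = "parent deleted"
    except sqlite3.IntegrityError:
        outcome = "delete refused"
    children = conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]
    return outcome, children

print(try_delete_parent("CASCADE"))   # ('parent deleted', 0)
print(try_delete_parent("RESTRICT"))  # ('delete refused', 1)
```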

The above tailored-to-PDG schema is complicated. A simpler, more generic schema is possible but incurs a higher query cost.

From Gemini I learned that the EAV (Entity-Attribute-Value) model is a normalized and traditional approach that is portable across almost any relational database.
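As a rough illustration of the EAV idea and its query cost, here is a sketch with Python's built-in sqlite3 module. The table and column names (`entities`, `attributes`, `edges`) are my own placeholders rather than a schema proposed by Gemini, and the `edges` table is an addition needed to keep the graph structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per node, one row per property, one row per edge.
CREATE TABLE entities (id TEXT PRIMARY KEY, node_type TEXT NOT NULL);
CREATE TABLE attributes (
    entity_id TEXT NOT NULL REFERENCES entities(id),
    name TEXT NOT NULL,
    value TEXT,
    PRIMARY KEY (entity_id, name)
);
CREATE TABLE edges (
    source_id TEXT NOT NULL REFERENCES entities(id),
    target_id TEXT NOT NULL REFERENCES entities(id),
    edge_type TEXT NOT NULL,
    sequence_index INTEGER
);
INSERT INTO entities VALUES ('e1', 'expression'), ('9491', 'scalar');
INSERT INTO attributes VALUES
    ('e1', 'latex', 'T = 1/f'),
    ('9491', 'latex', 'T'),
    ('9491', 'name_latex', 'period'),
    ('9491', 'dimension_time', '1');
INSERT INTO edges VALUES ('e1', '9491', 'HAS_SYMBOL', NULL);
""")

# Each property read back costs one extra join against `attributes`.
row = conn.execute("""
    SELECT sym_latex.value, sym_name.value
    FROM edges
    JOIN attributes AS sym_latex
      ON sym_latex.entity_id = edges.target_id AND sym_latex.name = 'latex'
    JOIN attributes AS sym_name
      ON sym_name.entity_id = edges.target_id AND sym_name.name = 'name_latex'
    WHERE edges.source_id = 'e1' AND edges.edge_type = 'HAS_SYMBOL'
""").fetchone()
print(row)  # ('T', 'period')
```

Three tables cover every node type, but all values are stored as text and every property that is read back adds another join.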

links to papers on Hilbert's Sixth problem

https://en.wikipedia.org/wiki/Hilbert%27s_sixth_problem

"A Field Guide to Recent Work on the Foundations of Statistical Mechanics" by R. Frigg
https://arxiv.org/abs/0804.0399 and
https://romanfrigg.org/wp-content/uploads/writings/SM_Fieldguide.pdf

"Hilbert's Sixth Problem: the endless road to rigour" by A. N. Gorban
https://arxiv.org/abs/1803.03599

"Long time derivation of the Boltzmann equation from hard sphere dynamics" by Yu Deng, Zaher Hani, Xiao Ma
https://arxiv.org/abs/2408.07818

"Hilbert's sixth problem: derivation of fluid equations via Boltzmann's kinetic theory" by Yu Deng, Zaher Hani, Xiao Ma
https://arxiv.org/abs/2503.01800

comments:

"Comment on "Hilbert's Sixth Problem: Derivation of Fluid Equations via Boltzmann's Kinetic Theory" by Deng, Hani, and Ma" by Gao
https://arxiv.org/abs/2504.06297
https://philsci-archive.pitt.edu/25009/1/Hilbert6c.pdf

Friday, October 17, 2025

I know more now than I did a few years ago

Why do I think I know more now than I did a few years ago?

Some of my assumptions have been validated or invalidated by experience of implementation
Some things I thought I knew have gained more nuance
I have gained insight on what needs to be learned - new domains, additional details
I better understand how previously disparate concepts are related
Implementation forces decisions about design that a mental model is not confronted with
Experience from implementing provides nuance and validation
Sometimes I have to undo previously made decisions due to new insights
I better understand which paths are dead ends and which are worth investment

Interestingly, even though I know that I need to learn more, I cannot know a priori which nuance is relevant or which path is a dead end.

Saturday, October 4, 2025

resizing to smaller VPS

I'm currently paying $12/month for my VPS (virtual private server) that has 2GB of RAM and 25GB of storage. I had been paying $6/month previously, but including Neo4j bumped the memory usage to nearly 100%. (CPU usage is averaging around 10% and spikes to 30%.)

I have a few iterations of the source code for the webserver so I'll need to figure out which is actually in use. After logging into the VPS, I see I have two folders:

allofphysics.com
ui_v7_website_flask_json

Using docker images I see the latest image (from 7 months ago) is allofphysicscom-flask. That doesn't help me figure out which repo is in use.

allofphysics.com/docker-compose.yaml has neo4j, whereas ui_v7_website_flask_json/docker-compose.yaml does not. Therefore I'm currently operating out of allofphysics.com/docker-compose.yaml

I have two options: either revert to the "v7" or disable neo4j in "allofphysics.com".

Going with option 1,

~/allofphysics.com$ docker compose down
~/allofphysics.com$ cd ~/ui_v7_website_flask_json
~/ui_v7_website_flask_json$ git pull
~/ui_v7_website_flask_json$ make

Turns out the packages are pretty old. The opencv package wasn't compatible. After I reverted all pip packages to known good version numbers I found the crypto libraries weren't happy. Eventually I was able to get a docker image built.

Next the certs weren't present so I copied those from ~/allofphysics.com/certs/ and that worked.

To free up disk space I ran

docker images | grep none | tr -s ' ' | cut -d' ' -f3 | xargs docker rmi

I powered down the VPS, changed to a smaller size, powered on, and then ran make to get a running webserver.

Tuesday, June 17, 2025

best practices for use of LLMs

I've previously written about best practices for prompts. This post is more abstract.

Avoid asking factual questions

The LLM is not a database of facts. Historical events, dates, places are not stored as exact references. LLMs generate their response based on statistical probabilities derived from patterns.

The more widely documented something is, the better the LLM knows it

The LLM's training is roughly proportional to the representation of the information on the Internet. An LLM is more reliable and detailed when discussing common knowledge.

Precise questions using relevant jargon with context yield useful output

Poorly worded questions that do not use domain-specific terminology are less likely to produce clear answers.

Do not trust citations

The LLM does not have citations hard-coded into the network. Citations are most likely to be hallucinations.

Decompose complex tasks and questions into a sequence of iterative prompts

There is a limited amount of "thinking" by the LLM per prompt, so simpler tasks are more likely to produce relevant answers.

Structure your question to produce a page or less of output

Producing a 200 page book from a single prompt devolves into hallucinations after a few pages. Shorter answers are more likely to remain lucid, so phrase your question in a way that can be answered with a small amount of text.

LLMs default to the average

While LLM output can be creative (in unexpected ways), seeking exceptional insight yields the mundane.

Simplify your question to a one-shot prompt

Iterative questions are more likely to yield hallucinations.

Delegation to an intern who doesn't learn

This can be confusing, as the LLM occasionally knows more than you do.

Wednesday, May 7, 2025

arm64 and amd64 docker builds

Makefile

# Get the machine architecture.
# On arm64 (Apple Silicon M1/M2/etc.), `uname -m` outputs "arm64".
# On amd64 (Intel), `uname -m` outputs "x86_64".
ARCH := $(shell uname -m)

ifeq ($(ARCH), arm64)
        this_arch=arm64
else ifeq ($(ARCH), x86_64)
        this_arch=amd64
else
        # $(error ...) halts make; a bare @echo outside a recipe is a syntax error.
        $(error Unknown architecture: $(ARCH). Cannot determine if Mac is new or old.)
endif

`docker push` multiple platforms using `buildx`

I wasn't able to get docker manifest to amend both amd64 and arm64, so I used the buildx approach:

docker buildx build --push --platform linux/arm64,linux/amd64 --tag benislocated/allofphysicscom-flask:latest .

Result:

ERROR: Multi-platform build is not supported for the docker driver.
Switch to a different driver, or turn on the containerd image store, and try again.
Learn more at https://docs.docker.com/go/build-multi-platform/

References

https://www.docker.com/blog/multi-arch-build-and-images-the-simple-way/

Sunday, May 4, 2025

requirements generation

BLUF/tl;dr: methods of requirements generation are described in this post: "one-shot think hard and brainstorm", learn from others, iterative adversarial feedback, and formal models.

As a solo hobby developer I get to do what feels right with regard to my project. That autonomy and freedom applies to both prioritization and scope. Part of the joy of working is based on being able to follow my curiosity, and to do so at a time and rate of my choosing.

When I work with another person on a shared goal, then there is value in identifying how to divide tasks. For example, by skill, by interests, by availability.

A vision distills into use cases, and these use cases determine requirements which determine tasks. Then sequencing and parallelization of tasks can happen. Let's refer to this as the "think hard and brainstorm" waterfall method. The success of waterfall relies on the ability of planners to identify all contingencies before taking action. Use of an LLM for generating requirements fits in this category as an alternative to thinking hard.

If someone else has a similar situation, learning from their requirements is a valid way of making progress. Plagiarism is acceptable; no need for being original.

The optimistic waterfall method described above assumes the alignment of incentives for participants doing the tasks. If the participants doing tasks are looking for the easiest solution to the requirement, they may provide results that don't satisfy the vision.

If the folks satisfying a requirement may be adversarial, that can be accounted for in an iterative manner.

Think hard and brainstorm to come up with an initial draft of requirements.
Provide the draft requirements to adversarial workers with the instructions, "provide a solution in a day." Leverage their creativity to provide an insufficient result.
Determine why the adversarial solutions (which do meet the requirements) don't satisfy the vision. Use that insight to develop better requirements.

Repeat the iterations until requirements are "fool proof" for the sample pool of fools.

A third method of coming up with requirements is to use formal methods. For example,

"Program Derivation is the practice of beginning with a specification of a function, and by a series of mechanical steps, deriving an efficient implementation." (source: https://www.youtube.com/watch?v=JDqD6RZpnZA)

https://p-org.github.io/P/ and https://github.com/p-org/P
https://www.youtube.com/watch?v=FdXZXnkMDxs
https://www.youtube.com/watch?v=tZnX585r4ms

https://www.cs.toronto.edu/~hehner/aPToP/aPToP.pdf

Tuesday, April 22, 2025

formal evaluation of adding one in Python

Here's what I want to accomplish:

program requirements, v1

objective: add one to a user-provided value

And here's an implementation in Python that I think provides that capability:

program implementation, v1

#!/usr/bin/env python3
import sys
"""
how hard can it be to add one to a user-provided value?
"""

def add_one_to_arg(number_str: str):
    """
    do the addition
    """
    try:
        number = int(number_str)
    except ValueError:
        try:
            number = float(number_str)
        except ValueError:
            print(f"Error: '{number_str}' is not a valid float.")
            print(f"Usage: python {sys.argv[0]} <number>")
            sys.exit(1) # Exit with an error status code

    return number + 1
    
if __name__ == "__main__":

    if len(sys.argv) < 2:
        print("Error: Please provide a number as a command-line argument.")
        print(f"Usage: python {sys.argv[0]} <number>")
        sys.exit(1) # Exit with a non-zero status code to indicate an error

    # Get the first argument (index 1)
    number_str = sys.argv[1]

    result = add_one_to_arg(number_str)

    print(f"{number_str} plus one is {result}")

#EOF

When I run the above script I get

$ python3 add_one.py 4
4 plus one is 5
$ python3 add_one.py 4.2
4.2 plus one is 5.2
$ python3 add_one.py cat
Error: 'cat' is not a valid float.
Usage: python add_one.py <number>

Next I'm going to intentionally add a few bugs and then ask how to prove the implementation has no bugs:

program implementation, v2 (with bugs)

#!/usr/bin/env python3

import sys
import random
import os
"""
how hard can it be to add one to a user-provided value?
"""

def add_one_to_arg(number_str: str):
    """
    do the addition
    """
    try:
        number = int(number_str)
    except ValueError:
        try:
            number = float(number_str)
        except ValueError:
            print(f"Error: '{number_str}' is not a valid float.")
            print(f"Usage: python {sys.argv[0]} <number>")
            sys.exit(1) # Exit with an error status code

    # deterministic bug
    if number == 3.14159265359:
        return 6

    # random bug
    if random.random()<0.001:
        return number+2

    # bug that is specific to user environment
    my_var = os.environ.get("ASDF")
    if my_var:
        return number+3

    return number + 1

if __name__ == "__main__":

    if len(sys.argv) < 2:
        print("Error: Please provide a number as a command-line argument.")
        print(f"Usage: python {sys.argv[0]} <number>")
        sys.exit(1) # Exit with a non-zero status code to indicate an error

    # Get the first argument (index 1)
    number_str = sys.argv[1]

    result = add_one_to_arg(number_str)

    print(f"{number_str} plus one is {result}")

#EOF

I've added three bugs in v2: a deterministic bug, a random bug, and a bug that depends on the user's environment. A brute force test would be expensive but could identify the first two bugs.
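To put a rough number on "expensive" for the random bug: it fires with probability 0.001 per call, so the number of independent calls n needed to see it at least once with a chosen confidence follows from 1 - (1 - p)^n >= confidence. A minimal sketch of that arithmetic (the helper name `trials_needed` is hypothetical and not part of the script above):

```python
import math


def trials_needed(p_bug: float, confidence: float) -> int:
    """Smallest n such that 1 - (1 - p_bug)**n >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_bug))


# p_bug = 0.001 matches the `random.random()<0.001` branch in v2
for confidence in (0.5, 0.99, 0.999):
    print(confidence, trials_needed(0.001, confidence))
```

Under these assumptions a single input value already needs thousands of repeated calls before the random bug is likely to show up, and that cost multiplies across every input value tested.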

There are a couple of problems with v1 of the program requirements to "add one to a user-provided value."

The input range (unstated) is negative infinity to positive infinity.
- Python does not have a built-in limit for the size of integers. The maximum integer value is restricted only by the available memory of the system.
- For floats there is an upper bound:
```
>>> import sys
>>> sys.float_info.max
1.7976931348623157e+308
```
Time-out conditions are unspecified. So if the program doesn't respond for 5 minutes, the requirements have nothing to say about that.
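Returning to the numeric limits above, the two types behave differently at their edges. A short sketch of what Python does, assuming CPython with IEEE 754 double-precision floats (the usual case):

```python
import math
import sys

# Integers: arbitrary precision, so adding one is exact even for huge values
big_int = 10**400
assert (big_int + 1) - big_int == 1

# Floats: a string beyond sys.float_info.max parses to infinity, not an error
too_big = float("1e400")
assert math.isinf(too_big)

# Adding one to infinity gives infinity, so "add one" silently has no effect
assert too_big + 1 == too_big

# The largest finite float is also unchanged by adding one
largest = sys.float_info.max
assert largest + 1 == largest
```

With an unbounded input range, the program can therefore return a value that is not "input plus one" without raising any error.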

Rewrite the program requirements to be more specific:

program requirements, v2

If the "computer" doing this calculation has a 1 Hz CPU clock frequency with 1 byte of RAM, that might result in the Python program being "right" but the hardware being inadequate.

Also, let's make explicit the assumption that we are operating in base 10 numbers.

To be safe with the input string, let's bound that to be less than 1000 characters.

program assumptions and requirements, v3

objective: add one to a user-provided value.
constraint: User-provided value must be between -1E100 and +1E100.
constraint: Response to user must be provided less than 1 minute after user input.
constraint: user-provided input and the result are both base 10 numbers.
constraint: user-provided input is less than 1000 characters.
assumption: the program is run on a MacBook Air M2 with 8GB of RAM with factory settings.

The revised implementation is

program implementation, v3 -- with bugs and constraints

#!/usr/bin/env python3

import sys
import random
import os
"""
how hard can it be to add one to a user-provided value?
"""

def add_one_to_arg(number_str: str):
    """
    do the addition
    """
    try:
        number = int(number_str)
    except ValueError:
        try:
            number = float(number_str)
        except ValueError:
            print(f"Error: '{number_str}' is not a valid float.")
            print(f"Usage: python {sys.argv[0]} <number>")
            sys.exit(1) # Exit with an error status code

    assert(number<1E100)
    assert(number>-1E100)

    # deterministic bug
    if number == 3.14159265359:
        return 6

    # random bug
    if random.random()<0.001:
        return number+2

    # bug that is specific to user environment
    my_var = os.environ.get("ASDF")
    if my_var:
        return number+3

    return number + 1

if __name__ == "__main__":

    if len(sys.argv) < 2:
        print("Error: Please provide a number as a command-line argument.")
        print(f"Usage: python {sys.argv[0]} <number>")
        sys.exit(1) # Exit with a non-zero status code to indicate an error

    # Get the first argument (index 1)
    number_str = sys.argv[1]

    assert(len(number_str)<1000)

    result = add_one_to_arg(number_str)

    print(f"{number_str} plus one is {result}")

#EOF
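One caveat with the `assert` statements in v3: Python removes them when run with the `-O` flag, so the constraints would then go unchecked. A hedged sketch of the same checks written as explicit exceptions (the function name `parse_bounded_number` and the two constants are hypothetical, not from the script above):

```python
MAX_INPUT_LENGTH = 1000   # from "input is less than 1000 characters"
MAX_MAGNITUDE = 1e100     # from "between -1E100 and +1E100"


def parse_bounded_number(number_str: str):
    """Parse an int or float and enforce the v3 constraints with exceptions."""
    if len(number_str) >= MAX_INPUT_LENGTH:
        raise ValueError("input must be shorter than 1000 characters")
    try:
        number = int(number_str)
    except ValueError:
        number = float(number_str)  # raises ValueError for non-numeric text
    # This comparison is False for nan and for infinities, so both are rejected
    if not (-MAX_MAGNITUDE < number < MAX_MAGNITUDE):
        raise ValueError("value must be between -1E100 and +1E100")
    return number


print(parse_bounded_number("4"))
print(parse_bounded_number("4.2"))
```

Raising `ValueError` keeps the checks active regardless of interpreter flags and lets the caller decide how to report the problem.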

Normal testing involves evaluating pre-determined cases, like "input 5, get 6" and "input 5.4, get 6.4" and "input 'cat', get error" and "input (nothing), get error."

program implementation, v4 -- with pytest

#!/usr/bin/env python3

import sys
import random
import os
import pytest
"""
how hard can it be to add one to a user-provided value?
"""

def add_one(number_str: str):
    """
    do the addition
    """
    try:
        number = int(number_str)
    except ValueError:
        try:
            number = float(number_str)
        except ValueError:
            print(f"Error: '{number_str}' is not a valid float.")
            print(f"Usage: python {sys.argv[0]} <number>")
            sys.exit(1) # Exit with an error status code

    assert(number<1E100)
    assert(number>-1E100)

    # deterministic bug
    if number == 3.14159265359:
        return 6

    # random bug
    if random.random()<0.001:
        return number+2

    # bug that is specific to user environment
    my_var = os.environ.get("ASDF")
    if my_var:
        return number+3

    return number + 1

def test_add_one_to_int():
    result = add_one(number_str="5")
    assert result == 6

def test_add_one_to_float():
    result = add_one(number_str="5.3")
    assert result == 6.3
    
def test_add_one_to_nuthin():
    with pytest.raises(SystemExit):
        result = add_one(number_str="")    

#EOF

Property-based testing (e.g., https://hypothesis.readthedocs.io/en/latest/) is where you "write tests which should pass for all inputs in whatever range you describe, and let Hypothesis randomly choose which of those inputs to check - including edge cases you might not have thought about."

program implementation, v4 -- with hypothesis

#!/usr/bin/env python3

import sys
import random
import os
import pytest
import hypothesis
"""
how hard can it be to add one to a user-provided value?
"""

def add_one(number_str: str):
    """
    do the addition
    """
    try:
        number = int(number_str)
    except ValueError:
        try:
            number = float(number_str)
        except ValueError:
            print(f"Error: '{number_str}' is not a valid float.")
            print(f"Usage: python {sys.argv[0]} <number>")
            sys.exit(1) # Exit with an error status code

    assert(number<1E100)
    assert(number>-1E100)

    # deterministic bug
    if number == 3.14159265359:
        return 6

    # random bug
    if random.random()<0.001:
        return number+2
        
    # bug that is specific to user environment
    my_var = os.environ.get("ASDF")
    if my_var:
        return number+3

    return number + 1

@hypothesis.given(number=(hypothesis.strategies.integers(-1E10, 1E10) | hypothesis.strategies.floats(-1E10, 1E10, allow_nan=False)))
@hypothesis.settings(max_examples=1000) # default is 100
def test_add_one_properties(number):
    result = add_one(number_str=str(number))
    assert result == number+1

#EOF

To run the above script, use

  pytest name_of_file.py
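Randomly generated inputs reach the random bug only about once per thousand calls and never reach the environment-specific branch unless the variable happens to be set. One complementary approach, sketched here with the standard library's `unittest.mock`, is to pin `random.random()` and `os.environ` so that each branch is exercised on purpose. The trimmed `add_one` copy and the helper `observed_with` are illustrative and not part of the scripts above.

```python
import os
import random
from unittest import mock


def add_one(number_str: str):
    """Trimmed copy of the buggy function (error messages omitted)."""
    try:
        number = int(number_str)
    except ValueError:
        number = float(number_str)
    if number == 3.14159265359:      # deterministic bug
        return 6
    if random.random() < 0.001:      # random bug
        return number + 2
    if os.environ.get("ASDF"):       # environment-specific bug
        return number + 3
    return number + 1


def observed_with(number_str, *, rand_value, env):
    """Call add_one with random.random() and os.environ pinned."""
    with mock.patch("random.random", return_value=rand_value), \
         mock.patch.dict(os.environ, env, clear=True):
        return add_one(number_str)


print(observed_with("5", rand_value=0.5, env={}))             # 6, correct
print(observed_with("5", rand_value=0.0, env={}))             # 7, random bug
print(observed_with("5", rand_value=0.5, env={"ASDF": "1"}))  # 8, env bug
```

This only works when the tester already suspects which inputs from the outside world matter, which is the gap that the formal approaches below try to close.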

Similarly, https://github.com/zlorb/PyModel is a model checker that generates test cases based on constraints.

This blog post is about formal methods. There are a few options:

formal design specification

https://github.com/fizzbee-io/FizzBee is a design specification language. FizzBee is meant to be used before you start coding.

formal implementation verification

https://github.com/marcoeilers/nagini, a static verification tool for Python using Viper, is an implementation verification language. https://www.youtube.com/watch?v=PIwP3SuWLb0
https://deal.readthedocs.io/basic/verification.html is an implementation verification language.
https://github.com/pschanely/CrossHair: a static verification tool for Python using symbolic execution. "Repeatedly calls your functions with symbolic inputs. It uses an SMT solver (a kind of theorem prover) to explore viable execution paths and find counterexamples for you."

https://github.com/formal-land/coq-of-python - Translate Python code to Coq code for formal verification. "formal-land" is a commercial company selling verification-as-a-service.
https://github.com/arsalan0c/dafny-of-python: translates a program written in a subset of typed Python along with its specification, to the Dafny verification language. Uses Dune, a build system for OCaml projects. Last update was 4 years ago. See https://dune.build/ and https://dune.readthedocs.io/en/latest/index.html

Design by Contract (https://en.wikipedia.org/wiki/Design_by_contract) approaches for Python include Dafny, Deal, and icontract.
For Dafny you write the program in Dafny and compile to Python.
For Deal you write Python and provide decorators.

Dafny

https://en.wikipedia.org/wiki/Dafny

"Dafny lifts the burden of writing bug-free code into that of writing bug-free annotations."
Dafny was created by Rustan Leino at Microsoft Research.
Dafny uses the Z3 automated theorem prover and Boogie.

Boogie is a simple programming language that is meant to be

an easy compile target (think "like JVM bytecode, but for proving code correct")
easy to analyze soundly
not actually intended to be executable

Instead of running Boogie programs, the Boogie compiler looks through the Boogie code to find assertions. For each assertion, the compiler generates a "verification condition", which is a formula based on a (symbolic) analysis of the program; the verification condition formula is constructed so that if the verification condition is true, the assertion holds.
It then hands those verification conditions, along with annotations in the program like assumptions, preconditions, postconditions, and loop invariants, to an SMT solver (Boogie uses Z3 by default). The SMT solver determines whether or not the assumptions definitely ensure the verification condition holds; Boogie complains about the assertions whose verification-conditions haven't been shown to hold.

source: https://www.reddit.com/r/ProgrammingLanguages/comments/tc55ld/how_does_the_dafny_programming_language/

https://dafny.org/dafny/OnlineTutorial/guide.html

Since "add 1" doesn't have a loop, the main aspects we'll need in Dafny are

requires: Preconditions -- what must be true before the function is called.
ensures: Postconditions -- what must be true after the function returns normally.

Requires input is a number (int or float)

Ensures input < output
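Dafny checks such conditions statically. As a rough illustration of the two conditions themselves, here is a run-time sketch in plain Python (the function `add_one_checked` is hypothetical and is not Dafny). It also suggests why the numeric type matters: "input < output" holds for integers but fails for large floats, where adding one is lost to rounding.

```python
def add_one_checked(number):
    """Add one, checking the precondition and postcondition at run time."""
    # requires: input is a number (int or float); bool is excluded on purpose
    if isinstance(number, bool) or not isinstance(number, (int, float)):
        raise TypeError("precondition failed: input must be an int or float")
    result = number + 1
    # ensures: input < output
    if not number < result:
        raise ArithmeticError("postcondition failed: input < output")
    return result


print(add_one_checked(4))       # int arithmetic is exact
print(add_one_checked(4.2))
try:
    add_one_checked(1e16)       # 1e16 + 1 rounds back to 1e16
except ArithmeticError as error:
    print(error)
```

A static verifier would have to be told which number model is in use (mathematical integers, reals, or machine floats) before it could prove or refute the postcondition.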

Deal

Integrates with Z3 prover

https://github.com/life4/deal

https://github.com/life4/deal-solver

https://deal.readthedocs.io/

https://deal.readthedocs.io/basic/verification.html#background

As of April 2025, Deal doesn't have an issue tracker and doesn't seem to be active.

deal example

#!/usr/bin/env python3
import deal
from typing import Union

@deal.pre(lambda number: number<1E100) # Precondition must be true before the function is executed.
@deal.pre(lambda number: number>-1E100)
@deal.ensure(lambda number, result: result==number+1)
def add_one_to_arg(number: Union[int, float]) -> Union[int, float]:
    """
    do the addition
    """
    return number + 1
#EOF

which can be run using

 python3 -m deal prove name_of_file.py

iContract

No prover, just consistency of conditions within Python using decorators.

https://github.com/Parquery/icontract

https://icontract.readthedocs.io/en/latest/

This report found lack of coverage

https://github.com/mristin/python-by-contract-corpus/blob/main/python_by_contract_corpus/incorrect_from_recorded/aoc2020/day_10_adapter_array/missed_edge_case_empty.py

from https://github.com/mristin/python-by-contract-corpus

Thursday, April 17, 2025

criteria for an impactful and efficient prototype

Prototypes are impactful and efficient when they include only the essential features. The consequence of that claim is that the prototype is janky (not easy to use), fragile (not robust), and shared prematurely (not "professional" looking). For software, a prototype might act on fake data and produce incorrect results.

Once stakeholders provide feedback, the RoI has been confirmed and the direction for where to invest more effort is clarified: what else is essential? Correctness is typically of interest, but that competes with ease-of-use, speed, and robustness.

Tuesday, January 28, 2025

a sequence of prompts to get an LLM to provide content for the Physics Derivation Graph

In https://github.com/allofphysicsgraph/allofphysics.com/issues/16 I created a set of prompts that got Gemini 2.0 (https://aistudio.google.com/prompts/) to generate content relevant for the Physics Derivation Graph. That probably saved a day or two of manual labor per derivation.

The steps below have been vetted in https://github.com/allofphysicsgraph/allofphysics.com/issues/20

Find a relevant derivation

bridge two concepts

In mathematical Physics a derivation is comprised of steps and equations.

Is there a derivation that bridges Newton's Second Law for the Center of Mass to the concepts of torque and moment of inertia which describe rotational motion? If yes, provide the steps of the derivation. Format the answer using Latex as a single file.

Latex is a software system for typesetting documents. LaTeX is widely used for the communication and publication of scientific documents and technical note-taking in many fields, owing partially to its support for complex mathematical notation.

Do not have equations inline with the text. Equations should be separate and inside a math environment for Latex:

\begin{equation}
\end{equation}

Each equation should have a left-hand side and right-hand side. Do specify equations, but avoid using equations that contain multiple equal signs.

Alternative starting point

reformat wikipedia to Latex

Wikipedia uses MediaWiki syntax. Convert the following MediaWiki content to a single Latex document.

Manually clean up the latex using Overleaf

Insert labels for equations.

For the following Latex file, write out the file and insert a label for each equation. The label that is added should be descriptive of the equation and contain no spaces, only underscores.

Labels for equations should be unique within the document.

Don't provide any commentary. Just write the revised Latex file content as your answer.
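Since LLM output is not guaranteed to follow the prompt, the label constraints can be checked mechanically afterwards. A minimal sketch using a regular expression (the helper names are hypothetical, and the pattern assumes labels are written as `\label{...}` with no nested braces):

```python
import re
from collections import Counter

LABEL_PATTERN = re.compile(r"\\label\{([^}]*)\}")


def duplicate_labels(latex_source: str) -> list:
    """Return labels that appear more than once in the document."""
    counts = Counter(LABEL_PATTERN.findall(latex_source))
    return sorted(label for label, count in counts.items() if count > 1)


def labels_with_spaces(latex_source: str) -> list:
    """Return labels that contain a space, which the prompt forbids."""
    return [label for label in LABEL_PATTERN.findall(latex_source) if " " in label]


sample = r"""
\begin{equation}
F = m a
\label{newtons_second_law}
\end{equation}
\begin{equation}
\tau = I \alpha
\label{newtons_second_law}
\end{equation}
\begin{equation}
L = I \omega
\label{angular momentum}
\end{equation}
"""
print(duplicate_labels(sample))
print(labels_with_spaces(sample))
```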

Make expression references explicit

The following Latex file contains a derivation with many steps. Sometimes a previous expression is referenced implicitly. Edit the following text to manually insert references where appropriate. For example replace text like "substitute the forces" with "substitute Eq.~\ref{} to get". Be sure to reference the relevant equation labels so that Latex can compile the document.

Add more steps

A derivation in mathematical Physics consists of a sequence of steps. Each step relates mathematical equations.

Modify the Latex file below by adding explanatory text about each step of the derivation. Document the mathematical transformations that relate each equation in the file to other equations in the file.

Use the Latex equation labels to reference equations.

Only if needed:

Separate equations

A derivation in mathematical Physics consists of a sequence of steps. Each step relates mathematical equations.

In the Latex file below, for equations that have multiple instances of the equals sign, separate those into multiple equations such that each equation only has one use of the equals sign.

Do not have equations in the text. Equations should be inside

\begin{equation}
\end{equation}

Write out the modified content as a single Latex file.
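Whether this prompt is needed at all can be decided by scanning the file first. A hedged sketch that flags `equation` environments containing more than one equals sign (the function name is hypothetical; it only looks at `\begin{equation}` ... `\end{equation}` blocks and counts every `=` character, so it may over-report in unusual cases):

```python
import re

EQUATION_PATTERN = re.compile(
    r"\\begin\{equation\}(.*?)\\end\{equation\}", re.DOTALL
)


def equations_with_multiple_equals(latex_source: str) -> list:
    """Return the bodies of equation environments with more than one '='."""
    flagged = []
    for body in EQUATION_PATTERN.findall(latex_source):
        if body.count("=") > 1:
            flagged.append(body.strip())
    return flagged


sample = r"""
\begin{equation}
F = m a
\end{equation}
\begin{equation}
a = \frac{dv}{dt} = \frac{d^2 x}{dt^2}
\end{equation}
"""
for body in equations_with_multiple_equals(sample):
    print(body)
```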

Add explanatory text between equations

A derivation in mathematical Physics consists of a sequence of steps. Each step relates mathematical equations.

In the Latex file below, where there are two equations with no explanatory text between them, modify the Latex file by adding explanatory text to document the mathematical transformations that relate each equation in the file to other equations in the file.

Use the Latex equation labels to reference equations.

List of variables is last step

list variables used in the derivation with reference

The following latex file contains equations and variables. Provide a list, formatted as Latex, of every unique variable and a description of that variable.

For each entry in the list include a reference to the labeled equations where each variable is used. Write out just the Latex list as your answer.

If you already have a list of variables,

list variables used in the derivation with reference

The following Latex file contains equations and variables. The first section contains a list of all the variables used in the equations. Modify the Latex list to include a reference to the labeled equations where each variable is used. Write out just the modified Latex list as your answer.
