Sunday, May 19, 2024

notes on status and near-term plans

status updates after ~30 hours of work over the course of 1 weekend.

I'm happy with the rewrite of the back-end since it makes the design more robust and flexible. 

I've learned enough Cypher to feel comfortable continuing with this path. I'm also somewhat confident (with no basis in experience) that switching to a different property graph is feasible without having to rewrite the front-end and controller).


Developer workflow

My "how do I develop the code" steps have evolved to include Black and mypy. 

After making changes to the code I format using Black and then use type hinting:
make black_out
docker run --rm -v`pwd`:/scratch --entrypoint='' -w /scratch/ property_graph_webserver mypy --check-untyped-defs webserver/app.py
To launch the web UI,
make black_out; docker ps | grep property | cut -d' ' -f1 | xargs docker kill; date; make up

I added Selenium tests to speed up the UI validation.


Near-term plans

There's a lot to do, so in roughly priority order (from most important near-term to will-get-to-later)
  1. Convert Latex expressions to Sympy
  2. Check the dimensionality and consistency of expressions using sympy
  3. Provide feedback to user on invalid inputs using dimensionality checks and symbol consistency
  4. Check steps using SymPy. Currently in the v7 code base the file "validate_steps_sympy.py" takes Latex as input, converts to SymPy, then validates step. I think the step validation should be done using pure SymPy? (rather than taking Latex as input).

Lean: Explorer how to include lean derivations per step. Can I get lean to run on my computer?

Add API support to enable curl interactions.
This would improve command line testability of workflows

Write more selenium tests.
This assumes the web UI is stable.

Make the HTML tables sortable (as already exists on https://derivationmap.net/)

Support rendering latex in HTML (as already exists on https://derivationmap.net/)

Migrating existing content into the new back end

Convert latex strings into PNG files for visualization (as already exists on https://derivationmap.net/)

Render derivations as PDF (as already exists on https://derivationmap.net/)

Saturday, May 18, 2024

distinguishing scalars, vectors, and matrices as operators or symbols

A Physics derivation has steps and expressions. Expressions are composed of symbols and operations. 

Symbols (e.g., x) can be constant or variable, real or complex. Symbols do not have arguments.

Operations are distinct from symbols because they require one or more symbols to operate upon. For example, +, ^, determinant(), etc.


Within the concept of symbols there are distinct categories: scalar, vector, matrix. The distinction is that they have different properties. A vector has a dimension, a matrix has 2 dimensions. 

Since vectors can contain variables (e.g., \vec{a} = [x,y,z] ) and matrices can contain scalars, then we need a new boolean property for symbols: is_composite, and a new numeric property for symbols: dimensions. 

When "dimension"=0, the symbol is a scalar. Then is_composite is false.
When "dimension"=1, the symbol is a vector. Then is_composite is either false or true. If true then the symbol has two or more edges with other (scalar) symbols.
When "dimension"=2, the symbol is a matrix. Then is_composite is either false or true. If true then the symbol has four or more edges with other (scalar) symbols.


However, this accommodation isn't sufficient. We can say "matrix A = matrix B" and consider the matrices as symbols. However, a quantum operator (e.g., bit flip) is a matrix that operates on a state -- an argument is required. There is an equivalence to the Hamiltonian ("H") and the matrix representation. Therefore the matrix is not a symbol as previously defined.


While the expression "matrix A = matrix B" is valid, " + = + " is not. The difference is that "+" is not a composite symbol.

"H = matrix B" is a valid expression even though "H" is an operator. 

Therefore the distinction between symbol and operators is misleading. The schema for symbols should be

  • latex
  • name_latex
  • description_latex
  • requires_arguments (boolean)
    • if requires_arguments=true (operator: +, /, ^, determinant), then 
      • argument_count (e.g., 1,2,3)
    • else requires_arguments=false (variable or constant or operator: a, x, H), then 
      • dimension (e.g., 0,1,2)
        • if dimension = 0 (aka scalar: "size=1x1" and "is_composite=false") then
          • dimension_length
          • dimension_time
          • dimension_luminosity
          • dimension_mass
        • if dimension = 1 (aka vector) then
          • is_composite = false or true
          • orientation is row xor column xor arbitrary
            • if row or column, 
              • size is arbitrary xor definite
                • if definite, "2x1" xor "1x2" xor "3x1" xor "1x3" ...
        • if dimension = 2 (aka matrix) then
          • is_composite = false or true
          • size arbitrary xor definite
            • if definite, "2x2" xor "2x3" xor "3x2" xor "3x3" xor ...

That means the major distinct subtypes of symbols are
  • symbol that requires arguments
  • symbol that does not require arguments, dimension 0
  • symbol that does not require arguments, dimension 1
  • symbol that does not require arguments, dimension 2

Saturday, March 9, 2024

Derivations, CAS, Lean, and Assumptions in Physics

Initially the Physics Derivation Graph documented expressions as Latex. Then SymPy was added to support validation of steps (is the step self-consistent) and dimensionality (is the expression self-consistent?). 

Recently I learned that Lean could be used to prove each step in a derivation. The difference between a Computer Algebra System (e.g., SymPy) and Lean is whether "a = b  --> a/b = 1" is a valid step -- it isn't when b is zero. Lean catches that; SymPy does not. 

While Lean proofs sound like the last possible refinement, there are two additional complications to account for not addressed by Lean. 

Challenge: Bounded ranges of applicability

In classical mechanics the relation between momentum, mass, and velocity is "p = m v". That hold when "v << c". Near the speed of light we need to switch to relativistic mass, 

m = m_{rest} / sqrt{1-((v^2)/(c^2))}.

The boundary between "v << c" and "v ~ c" is usually set by the context being considered. 

One response for users of Lean would be to always use the "correct" relativistic equation, even when "v << c."  A more conventional approach used by Physicists is to use

p = m v, where v << c

then drop the "v << c" clause and rely on context.


Challenge: Real versus Float versus experimental characterization

Lean forces you to characterize numbers as Real or Integer or Complex. This presents a problem for numerical simulations that have something like a 64 bit float representation.

In thermodynamics we assume the number of particles involved is sufficiently large that we focus on the behavior of the ensemble rather than individual particles. The imprecision of floats is not correct, but neither is the infinite precision assumed by Real numbers. 


Example applications of Lean proofs needing bounds on values

Math doesn't have convenient ways of indicating "finite precision, as set by the Plank scale."  The differential element used in calculus cannot actually go to zero, but we use that concept because it works at the scales we are used to. 

Physicists make simplifying assumptions that sometimes ignore reality (e.g., assuming continuous media when particles are discrete). Then again the assumption that particles are discrete is also a convenient fiction that ignores the wavefunction of quantum mechanics. 

Lean can be used to prove derivations in classical mechanics, but to be explicit about the bounds of those proofs we'd also need to indicate "v << c" and "assume space is Euclidean." 

For molecular dynamics, another constraint to account for is "temperature << 1E10 Kelvin" or whatever temperature the atoms breaks down into plasma. 

Distinguishing the context of (classical mechanics from quantum) and (classical from relativistic) and (conventional gas versus plasma) seems important so that we know when a claim proven in Lean is applicable. 

Saturday, March 2, 2024

dichotomy of assumptions

In Physics there are some assumptions that form a dichotomy:

  • is the speed of light constant or variable?
  • is the measure of energy discrete or continuous?

In the dichotomy of assumptions, one of the two assumptions is reflective of reality, and the other is an oversimplification. The oversimplification is related to reality by assumptions, constraints, and limits. 

(I define "oversimplification" as the extension of useful assumptions to incorrect regions.)

Another case where oversimplification is the link between domains is quantum physics and (classical) statistical physics. Quantum particles are either Fermions (odd half integer spin) or Bosons (integer spin), but that is practically irrelevant for large ensembles of particles at room temperature. The aspects that get measured at one scale (e.g., particle velocity) are related to but separate from metrics at another scale (e.g, temperature, entropy). Mathematically this transition manifests as the switch from summation to integration.


So what? 
This is a new-to-me category of derivations which span domains. What constitutes a domain is set by the assumptions that form the boundaries, and oversimplification is how to cross the boundaries. 

What boundaries should the Physics Derivation Graph transgress? What oversimplifications are adjacent?

The evidences of dissonance (e.g, Mercury’s perihelion based on Newtonian gravitation versus relativity, the Deflection of Starlight; source) are not relevant for bridging domains. They are illustrations of the oversimplification.

Update 2024-03-10: on the page https://en.wikipedia.org/wiki/Phase_space#Quantum_mechanics

"by expressing quantum mechanics in phase space (the same ambit as for classical mechanics), the Weyl map facilitates recognition of quantum mechanics as a deformation (generalization) of classical mechanics, with deformation parameter ħ/S, where S is the action of the relevant process. (Other familiar deformations in physics involve the deformation of classical Newtonian into relativistic mechanics, with deformation parameter v/c; or the deformation of Newtonian gravity into general relativity, with deformation parameter Schwarzschild radius/characteristic dimension.)
 
Classical expressions, observables, and operations (such as Poisson brackets) are modified by ħ-dependent quantum corrections, as the conventional commutative multiplication applying in classical mechanics is generalized to the noncommutative star-multiplication characterizing quantum mechanics and underlying its uncertainty principle."
See also https://en.wikipedia.org/wiki/Wigner%E2%80%93Weyl_transform#Deformation_quantization

Sunday, February 25, 2024

derivations, identities, and other categories of mathematical physics

Transforming from one expression to another is carried out with respect to a goal. There are different categories of goals (listed below). I don't yet know what distinguishes one category from another.

My motivation for the Physics Derivation Graph is exclusively the first category -- derivations, specifically across domains. This motivation is more specific than a broader question, "what is the relation between every expression in mathematical physics and every other expression?" because the answer might be "symbols and inference rules are common."

The other categories are neat but not as rewarding for me. These other categories are included in the PDG because the infrastructure for software and UI and data format are the same.


Examples of Derivations

Relation between two disparate descriptions of nature:

Derivation of expression from simpler observations plus assumptions:
An expression in one domain simplifies to another expression in a different domain under modified assumptions

Examples of Identities

An equation can be expressed differently but be equivalent:
Show that two sides of an equation are, in fact, the same:

Examples of Use Case

Start from one or more expressions to derive a conclusion with practical utility
This is the focus of the ATOMS lab at UMBC, specifically enacted using Lean.

Sunday, January 21, 2024

Python dependencies visualization using pydeps

To see what Python modules are available on my system I can run

pydoc3 modules

One of the modules is pandas. Here's a script that just loads pandas module:

cat load_pandas.py 
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
import pandas as pd 

When the script is run nothing happens

python3 load_pandas.py
I have pydeps installed:
pydeps --version
pydeps v1.12.17

Output of analysis by pydeps:

pydeps -o myfile.png --show-dot -T png --noshow load_pandas.py 

digraph G {
    concentrate = true;

    rankdir = TB;
    node [style=filled,fillcolor="#ffffff",fontcolor="#000000",fontname=Helvetica,fontsize=10];

    load_pandas_py [fillcolor="#a65959",fontcolor="#ffffff",label="load_pandas.py"];
    pandas [fillcolor="#039595",fontcolor="#ffffff"];
    pandas__config [fillcolor="#24d0d0",label="pandas._config"];
    pandas__version [fillcolor="#47c2c2",label="pandas\.\n_version"];
    pandas__version_meson [fillcolor="#47c2c2",label="pandas\.\n_version_meson"];
    pandas_api [fillcolor="#3db8b8",label="pandas.api"];
    pandas_arrays [fillcolor="#46a4a4",label="pandas.arrays"];
    pandas_compat [fillcolor="#17d3d3",label="pandas.compat"];
    pandas_compat__optional [fillcolor="#31c4c4",label="pandas\.\ncompat\.\n_optional"];
    pandas_compat_pyarrow [fillcolor="#3db8b8",label="pandas\.\ncompat\.\npyarrow"];
    pandas_core [fillcolor="#10f9f9",label="pandas.core"];
    pandas_core_api [fillcolor="#3a8888",fontcolor="#ffffff",label="pandas\.\ncore\.\napi"];
    pandas_core_computation [fillcolor="#3bcece",label="pandas\.\ncore\.\ncomputation"];
    pandas_core_computation_api [fillcolor="#46a4a4",label="pandas\.\ncore\.\ncomputation\.\napi"];
    pandas_core_config_init [fillcolor="#3a8888",fontcolor="#ffffff",label="pandas\.\ncore\.\nconfig_init"];
    pandas_core_dtypes [fillcolor="#2fdbdb",label="pandas\.\ncore\.\ndtypes"];
    pandas_core_dtypes_dtypes [fillcolor="#339999",fontcolor="#ffffff",label="pandas\.\ncore\.\ndtypes\.\ndtypes"];
    pandas_core_reshape [fillcolor="#47c2c2",label="pandas\.\ncore\.\nreshape"];
    pandas_core_reshape_api [fillcolor="#46a4a4",label="pandas\.\ncore\.\nreshape\.\napi"];
    pandas_errors [fillcolor="#3ab0b0",label="pandas.errors"];
    pandas_io [fillcolor="#0bdfdf",label="pandas.io"];
    pandas_io_api [fillcolor="#46a4a4",label="pandas.io.api"];
    pandas_io_json [fillcolor="#23c8c8",label="pandas.io.json"];
    pandas_io_json__normalize [fillcolor="#4cb3b3",label="pandas\.\nio\.\njson\.\n_normalize"];
    pandas_plotting [fillcolor="#31c4c4",label="pandas\.\nplotting"];
    pandas_testing [fillcolor="#4cb3b3",label="pandas.testing"];
    pandas_tseries [fillcolor="#23c8c8",label="pandas.tseries"];
    pandas_tseries_api [fillcolor="#46a4a4",label="pandas\.\ntseries\.\napi"];
    pandas_tseries_offsets [fillcolor="#26d9d9",label="pandas\.\ntseries\.\noffsets"];
    pandas_util [fillcolor="#05e5e5",label="pandas.util"];
    pandas_util__print_versions [fillcolor="#409696",fontcolor="#ffffff",label="pandas\.\nutil\.\n_print_versions"];
    pandas_util__tester [fillcolor="#46a4a4",label="pandas\.\nutil\.\n_tester"];
    pandas -> load_pandas_py [fillcolor="#039595",minlen="2"];
    pandas__config -> pandas [fillcolor="#24d0d0"];
    pandas__config -> pandas_core_config_init [fillcolor="#24d0d0",minlen="2"];
    pandas__config -> pandas_errors [fillcolor="#24d0d0"];
    pandas__version -> pandas [fillcolor="#47c2c2"];
    pandas__version -> pandas_util__print_versions [fillcolor="#47c2c2",minlen="2"];
    pandas__version_meson -> pandas [fillcolor="#47c2c2"];
    pandas__version_meson -> pandas_util__print_versions [fillcolor="#47c2c2",minlen="2"];
    pandas_api -> pandas [fillcolor="#3db8b8"];
    pandas_arrays -> pandas [fillcolor="#46a4a4"];
    pandas_compat -> pandas [fillcolor="#17d3d3"];
    pandas_compat -> pandas_core_dtypes_dtypes [fillcolor="#17d3d3",minlen="3"];
    pandas_compat -> pandas_util__print_versions [fillcolor="#17d3d3",minlen="2"];
    pandas_compat -> pandas_util__tester [fillcolor="#17d3d3",minlen="2"];
    pandas_compat__optional -> pandas [fillcolor="#31c4c4",minlen="2"];
    pandas_compat__optional -> pandas_util__print_versions [fillcolor="#31c4c4",minlen="2"];
    pandas_compat__optional -> pandas_util__tester [fillcolor="#31c4c4",minlen="2"];
    pandas_compat_pyarrow -> pandas [fillcolor="#3db8b8",minlen="2"];
    pandas_compat_pyarrow -> pandas_compat [fillcolor="#3db8b8",weight="2"];
    pandas_core -> pandas [fillcolor="#10f9f9"];
    pandas_core -> pandas_arrays [fillcolor="#10f9f9"];
    pandas_core -> pandas_util [fillcolor="#10f9f9"];
    pandas_core_api -> pandas [fillcolor="#3a8888",minlen="2"];
    pandas_core_computation -> pandas [fillcolor="#3bcece",minlen="2"];
    pandas_core_computation -> pandas_core_config_init [fillcolor="#3bcece",weight="2"];
    pandas_core_computation_api -> pandas [fillcolor="#46a4a4",minlen="3"];
    pandas_core_config_init -> pandas [fillcolor="#3a8888",minlen="2"];
    pandas_core_dtypes -> pandas [fillcolor="#2fdbdb",minlen="2"];
    pandas_core_dtypes -> pandas_core_api [fillcolor="#2fdbdb",weight="2"];
    pandas_core_dtypes -> pandas_core_config_init [fillcolor="#2fdbdb",weight="2"];
    pandas_core_dtypes_dtypes -> pandas [fillcolor="#339999",minlen="3"];
    pandas_core_dtypes_dtypes -> pandas_core_api [fillcolor="#339999",minlen="2",weight="2"];
    pandas_core_reshape -> pandas [fillcolor="#47c2c2",minlen="2"];
    pandas_core_reshape_api -> pandas [fillcolor="#46a4a4",minlen="3"];
    pandas_errors -> pandas [fillcolor="#3ab0b0"];
    pandas_errors -> pandas_core_dtypes_dtypes [fillcolor="#3ab0b0",minlen="3"];
    pandas_io -> pandas [fillcolor="#0bdfdf"];
    pandas_io -> pandas_core_api [fillcolor="#0bdfdf",minlen="2"];
    pandas_io -> pandas_core_config_init [fillcolor="#0bdfdf",minlen="2"];
    pandas_io_api -> pandas [fillcolor="#46a4a4",minlen="2"];
    pandas_io_json -> pandas [fillcolor="#23c8c8",minlen="2"];
    pandas_io_json -> pandas_io [fillcolor="#23c8c8",weight="2"];
    pandas_io_json -> pandas_io_api [fillcolor="#23c8c8",weight="2"];
    pandas_io_json__normalize -> pandas [fillcolor="#4cb3b3",minlen="3"];
    pandas_plotting -> pandas [fillcolor="#31c4c4"];
    pandas_plotting -> pandas_core_config_init [fillcolor="#31c4c4",minlen="2"];
    pandas_testing -> pandas [fillcolor="#4cb3b3"];
    pandas_tseries -> pandas [fillcolor="#23c8c8"];
    pandas_tseries -> pandas_core_api [fillcolor="#23c8c8",minlen="2"];
    pandas_tseries_api -> pandas [fillcolor="#46a4a4",minlen="2"];
    pandas_tseries_offsets -> pandas [fillcolor="#26d9d9",minlen="2"];
    pandas_tseries_offsets -> pandas_core_api [fillcolor="#26d9d9",minlen="2"];
    pandas_tseries_offsets -> pandas_tseries [fillcolor="#26d9d9",weight="2"];
    pandas_tseries_offsets -> pandas_tseries_api [fillcolor="#26d9d9",weight="2"];
    pandas_util -> pandas [fillcolor="#05e5e5"];
    pandas_util -> pandas_arrays [fillcolor="#05e5e5"];
    pandas_util -> pandas_compat__optional [fillcolor="#05e5e5",minlen="2"];
    pandas_util -> pandas_compat_pyarrow [fillcolor="#05e5e5",minlen="2"];
    pandas_util -> pandas_core_dtypes_dtypes [fillcolor="#05e5e5",minlen="3"];
    pandas_util -> pandas_errors [fillcolor="#05e5e5"];
    pandas_util__print_versions -> pandas [fillcolor="#409696",minlen="2"];
    pandas_util__tester -> pandas [fillcolor="#46a4a4",minlen="2"];
}

Tuesday, December 19, 2023

LLM that includes the concept of inference rules

Question: What's the difference between a plain old search engine and an LLM+RAG?
Answer: LLM+RAG provides an experience like semantic search capability plus synthesis but without the need for semantic tagging on the front-end or the back-end. 
[https://www.sbert.net/examples/applications/semantic-search/README.html#semantic-search]

Relevance to the Physics Derivation Graph: add the following to an existing large language model (LLM)

  • the list of inference rules for the Physics Derivation Graph
  • examples of Latex-to-Sage conversion
  • example Lean4 proofs
"fine tuning" versus "context provision"

How is "context provision" different from RAG?

What's the difference between a transformer and a model?

specifically



output of help for llama.cpp

docker run -it --rm  -v `pwd`:/scratch llama-cpp-with-mistral-7b-v0.1.q6_k:2023-12-22 /bin/bash 
root@dc98ac4a23d5:/opt/llama.cpp# ./main -h

usage: ./main [options]

options:
  -h, --help            show this help message and exit
      --version         show version and build info
  -i, --interactive     run in interactive mode
  --interactive-first   run in interactive mode and wait for input right away
  -ins, --instruct      run in instruction mode (use with Alpaca models)
  -cml, --chatml        run in chatml mode (use with ChatML-compatible models)
  --multiline-input     allows you to write or paste multiple lines without ending each in '\'
  -r PROMPT, --reverse-prompt PROMPT
                        halt generation at PROMPT, return control in interactive mode
                        (can be specified more than once for multiple prompts).
  --color               colorise output to distinguish prompt and user input from generations
  -s SEED, --seed SEED  RNG seed (default: -1, use random seed for < 0)
  -t N, --threads N     number of threads to use during generation (default: 20)
  -tb N, --threads-batch N
                        number of threads to use during batch and prompt processing (default: same as --threads)
  -p PROMPT, --prompt PROMPT
                        prompt to start generation with (default: empty)
  -e, --escape          process prompt escapes sequences (\n, \r, \t, \', \", \\)
  --prompt-cache FNAME  file to cache prompt state for faster startup (default: none)
  --prompt-cache-all    if specified, saves user input and generations to cache as well.
                        not supported with --interactive or other interactive options
  --prompt-cache-ro     if specified, uses the prompt cache but does not update it.
  --random-prompt       start with a randomized prompt.
  --in-prefix-bos       prefix BOS to user inputs, preceding the `--in-prefix` string
  --in-prefix STRING    string to prefix user inputs with (default: empty)
  --in-suffix STRING    string to suffix after user inputs with (default: empty)
  -f FNAME, --file FNAME
                        prompt file to start generation.
  -n N, --n-predict N   number of tokens to predict (default: -1, -1 = infinity, -2 = until context filled)
  -c N, --ctx-size N    size of the prompt context (default: 512, 0 = loaded from model)
  -b N, --batch-size N  batch size for prompt processing (default: 512)
  --samplers            samplers that will be used for generation in the order, separated by ';', for example: "top_k;tfs;typical;top_p;min_p;temp"
  --sampling-seq        simplified sequence for samplers that will be used (default: kfypmt)
  --top-k N             top-k sampling (default: 40, 0 = disabled)
  --top-p N             top-p sampling (default: 0.9, 1.0 = disabled)
  --min-p N             min-p sampling (default: 0.1, 0.0 = disabled)
  --tfs N               tail free sampling, parameter z (default: 1.0, 1.0 = disabled)
  --typical N           locally typical sampling, parameter p (default: 1.0, 1.0 = disabled)
  --repeat-last-n N     last n tokens to consider for penalize (default: 64, 0 = disabled, -1 = ctx_size)
  --repeat-penalty N    penalize repeat sequence of tokens (default: 1.1, 1.0 = disabled)
  --presence-penalty N  repeat alpha presence penalty (default: 0.0, 0.0 = disabled)
  --frequency-penalty N repeat alpha frequency penalty (default: 0.0, 0.0 = disabled)
  --mirostat N          use Mirostat sampling.
                        Top K, Nucleus, Tail Free and Locally Typical samplers are ignored if used.
                        (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
  --mirostat-lr N       Mirostat learning rate, parameter eta (default: 0.1)
  --mirostat-ent N      Mirostat target entropy, parameter tau (default: 5.0)
  -l TOKEN_ID(+/-)BIAS, --logit-bias TOKEN_ID(+/-)BIAS
                        modifies the likelihood of token appearing in the completion,
                        i.e. `--logit-bias 15043+1` to increase likelihood of token ' Hello',
                        or `--logit-bias 15043-1` to decrease likelihood of token ' Hello'
  --grammar GRAMMAR     BNF-like grammar to constrain generations (see samples in grammars/ dir)
  --grammar-file FNAME  file to read grammar from
  --cfg-negative-prompt PROMPT
                        negative prompt to use for guidance. (default: empty)
  --cfg-negative-prompt-file FNAME
                        negative prompt file to use for guidance. (default: empty)
  --cfg-scale N         strength of guidance (default: 1.000000, 1.0 = disable)
  --rope-scaling {none,linear,yarn}
                        RoPE frequency scaling method, defaults to linear unless specified by the model
  --rope-scale N        RoPE context scaling factor, expands context by a factor of N
  --rope-freq-base N    RoPE base frequency, used by NTK-aware scaling (default: loaded from model)
  --rope-freq-scale N   RoPE frequency scaling factor, expands context by a factor of 1/N
  --yarn-orig-ctx N     YaRN: original context size of model (default: 0 = model training context size)
  --yarn-ext-factor N   YaRN: extrapolation mix factor (default: 1.0, 0.0 = full interpolation)
  --yarn-attn-factor N  YaRN: scale sqrt(t) or attention magnitude (default: 1.0)
  --yarn-beta-slow N    YaRN: high correction dim or alpha (default: 1.0)
  --yarn-beta-fast N    YaRN: low correction dim or beta (default: 32.0)
  --ignore-eos          ignore end of stream token and continue generating (implies --logit-bias 2-inf)
  --no-penalize-nl      do not penalize newline token
  --temp N              temperature (default: 0.8)
  --logits-all          return logits for all tokens in the batch (default: disabled)
  --hellaswag           compute HellaSwag score over random tasks from datafile supplied with -f
  --hellaswag-tasks N   number of tasks to use when computing the HellaSwag score (default: 400)
  --keep N              number of tokens to keep from the initial prompt (default: 0, -1 = all)
  --draft N             number of tokens to draft for speculative decoding (default: 8)
  --chunks N            max number of chunks to process (default: -1, -1 = all)
  -np N, --parallel N   number of parallel sequences to decode (default: 1)
  -ns N, --sequences N  number of sequences to decode (default: 1)
  -pa N, --p-accept N   speculative decoding accept probability (default: 0.5)
  -ps N, --p-split N    speculative decoding split probability (default: 0.1)
  -cb, --cont-batching  enable continuous batching (a.k.a dynamic batching) (default: disabled)
  --mmproj MMPROJ_FILE  path to a multimodal projector file for LLaVA. see examples/llava/README.md
  --image IMAGE_FILE    path to an image file. use with multimodal models
  --mlock               force system to keep model in RAM rather than swapping or compressing
  --no-mmap             do not memory-map model (slower load but may reduce pageouts if not using mlock)
  --numa                attempt optimizations that help on some NUMA systems
                        if run without this previously, it is recommended to drop the system page cache before using this
                        see https://github.com/ggerganov/llama.cpp/issues/1437
  --verbose-prompt      print prompt before generation
  -dkvc, --dump-kv-cache
                        verbose print of the KV cache
  -nkvo, --no-kv-offload
                        disable KV offload
  -ctk TYPE, --cache-type-k TYPE
                        KV cache data type for K (default: f16)
  -ctv TYPE, --cache-type-v TYPE
                        KV cache data type for V (default: f16)
  --simple-io           use basic IO for better compatibility in subprocesses and limited consoles
  --lora FNAME          apply LoRA adapter (implies --no-mmap)
  --lora-scaled FNAME S apply LoRA adapter with user defined scaling S (implies --no-mmap)
  --lora-base FNAME     optional model to use as a base for the layers modified by the LoRA adapter
  -m FNAME, --model FNAME
                        model path (default: models/7B/ggml-model-f16.gguf)
  -md FNAME, --model-draft FNAME
                        draft model for speculative decoding
  -ld LOGDIR, --logdir LOGDIR
                        path under which to save YAML logs (no logging if unset)
  --override-kv KEY=TYPE:VALUE
                        advanced option to override model metadata by key. may be specified multiple times.
                        types: int, float, bool. example: --override-kv tokenizer.ggml.add_bos_token=bool:false

log options:
  --log-test            Run simple logging test
  --log-disable         Disable trace logs
  --log-enable          Enable trace logs
  --log-file            Specify a log filename (without extension)
  --log-new             Create a separate new log file on start. Each log file will have unique name: "<name>.<ID>.log"
  --log-append          Don't truncate the old log file.