Tuesday, December 19, 2023

LLM that includes the concept of inference rules

Question: What's the difference between a plain old search engine and an LLM+RAG?
Answer: LLM+RAG provides something like semantic search plus synthesis of the retrieved results, without the need for semantic tagging on either the front end or the back end.
[https://www.sbert.net/examples/applications/semantic-search/README.html#semantic-search]
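For contrast, plain semantic search (what the sbert.net link above documents) retrieves relevant passages but does not synthesize an answer. A minimal sketch using the sentence-transformers library; the model name, corpus, and query are illustrative choices, not from this post:

```python
from sentence_transformers import SentenceTransformer, util

# "all-MiniLM-L6-v2" is a common general-purpose embedding model.
model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "Period is the interval of time between events.",
    "Frequency is the number of occurrences of a repeating event per unit of time.",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query_embedding = model.encode("How are period and frequency related?",
                               convert_to_tensor=True)

# Returns, per query, a ranked list of {"corpus_id": ..., "score": ...} dicts.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(corpus[hit["corpus_id"]], hit["score"])
```

In LLM+RAG the top hits would then be pasted into the model's context window for synthesis rather than returned directly.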

Relevance to the Physics Derivation Graph: add the following to an existing large language model (LLM):

  • the list of inference rules for the Physics Derivation Graph
  • examples of Latex-to-Sage conversion
  • example Lean4 proofs
"fine tuning" versus "context provision"

How is "context provision" different from RAG?

What's the difference between a transformer and a model?

specifically



output of help for llama.cpp

docker run -it --rm  -v `pwd`:/scratch llama-cpp-with-mistral-7b-v0.1.q6_k:2023-12-22 /bin/bash 
root@dc98ac4a23d5:/opt/llama.cpp# ./main -h

usage: ./main [options]

options:
  -h, --help            show this help message and exit
      --version         show version and build info
  -i, --interactive     run in interactive mode
  --interactive-first   run in interactive mode and wait for input right away
  -ins, --instruct      run in instruction mode (use with Alpaca models)
  -cml, --chatml        run in chatml mode (use with ChatML-compatible models)
  --multiline-input     allows you to write or paste multiple lines without ending each in '\'
  -r PROMPT, --reverse-prompt PROMPT
                        halt generation at PROMPT, return control in interactive mode
                        (can be specified more than once for multiple prompts).
  --color               colorise output to distinguish prompt and user input from generations
  -s SEED, --seed SEED  RNG seed (default: -1, use random seed for < 0)
  -t N, --threads N     number of threads to use during generation (default: 20)
  -tb N, --threads-batch N
                        number of threads to use during batch and prompt processing (default: same as --threads)
  -p PROMPT, --prompt PROMPT
                        prompt to start generation with (default: empty)
  -e, --escape          process prompt escapes sequences (\n, \r, \t, \', \", \\)
  --prompt-cache FNAME  file to cache prompt state for faster startup (default: none)
  --prompt-cache-all    if specified, saves user input and generations to cache as well.
                        not supported with --interactive or other interactive options
  --prompt-cache-ro     if specified, uses the prompt cache but does not update it.
  --random-prompt       start with a randomized prompt.
  --in-prefix-bos       prefix BOS to user inputs, preceding the `--in-prefix` string
  --in-prefix STRING    string to prefix user inputs with (default: empty)
  --in-suffix STRING    string to suffix after user inputs with (default: empty)
  -f FNAME, --file FNAME
                        prompt file to start generation.
  -n N, --n-predict N   number of tokens to predict (default: -1, -1 = infinity, -2 = until context filled)
  -c N, --ctx-size N    size of the prompt context (default: 512, 0 = loaded from model)
  -b N, --batch-size N  batch size for prompt processing (default: 512)
  --samplers            samplers that will be used for generation in the order, separated by ';', for example: "top_k;tfs;typical;top_p;min_p;temp"
  --sampling-seq        simplified sequence for samplers that will be used (default: kfypmt)
  --top-k N             top-k sampling (default: 40, 0 = disabled)
  --top-p N             top-p sampling (default: 0.9, 1.0 = disabled)
  --min-p N             min-p sampling (default: 0.1, 0.0 = disabled)
  --tfs N               tail free sampling, parameter z (default: 1.0, 1.0 = disabled)
  --typical N           locally typical sampling, parameter p (default: 1.0, 1.0 = disabled)
  --repeat-last-n N     last n tokens to consider for penalize (default: 64, 0 = disabled, -1 = ctx_size)
  --repeat-penalty N    penalize repeat sequence of tokens (default: 1.1, 1.0 = disabled)
  --presence-penalty N  repeat alpha presence penalty (default: 0.0, 0.0 = disabled)
  --frequency-penalty N repeat alpha frequency penalty (default: 0.0, 0.0 = disabled)
  --mirostat N          use Mirostat sampling.
                        Top K, Nucleus, Tail Free and Locally Typical samplers are ignored if used.
                        (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
  --mirostat-lr N       Mirostat learning rate, parameter eta (default: 0.1)
  --mirostat-ent N      Mirostat target entropy, parameter tau (default: 5.0)
  -l TOKEN_ID(+/-)BIAS, --logit-bias TOKEN_ID(+/-)BIAS
                        modifies the likelihood of token appearing in the completion,
                        i.e. `--logit-bias 15043+1` to increase likelihood of token ' Hello',
                        or `--logit-bias 15043-1` to decrease likelihood of token ' Hello'
  --grammar GRAMMAR     BNF-like grammar to constrain generations (see samples in grammars/ dir)
  --grammar-file FNAME  file to read grammar from
  --cfg-negative-prompt PROMPT
                        negative prompt to use for guidance. (default: empty)
  --cfg-negative-prompt-file FNAME
                        negative prompt file to use for guidance. (default: empty)
  --cfg-scale N         strength of guidance (default: 1.000000, 1.0 = disable)
  --rope-scaling {none,linear,yarn}
                        RoPE frequency scaling method, defaults to linear unless specified by the model
  --rope-scale N        RoPE context scaling factor, expands context by a factor of N
  --rope-freq-base N    RoPE base frequency, used by NTK-aware scaling (default: loaded from model)
  --rope-freq-scale N   RoPE frequency scaling factor, expands context by a factor of 1/N
  --yarn-orig-ctx N     YaRN: original context size of model (default: 0 = model training context size)
  --yarn-ext-factor N   YaRN: extrapolation mix factor (default: 1.0, 0.0 = full interpolation)
  --yarn-attn-factor N  YaRN: scale sqrt(t) or attention magnitude (default: 1.0)
  --yarn-beta-slow N    YaRN: high correction dim or alpha (default: 1.0)
  --yarn-beta-fast N    YaRN: low correction dim or beta (default: 32.0)
  --ignore-eos          ignore end of stream token and continue generating (implies --logit-bias 2-inf)
  --no-penalize-nl      do not penalize newline token
  --temp N              temperature (default: 0.8)
  --logits-all          return logits for all tokens in the batch (default: disabled)
  --hellaswag           compute HellaSwag score over random tasks from datafile supplied with -f
  --hellaswag-tasks N   number of tasks to use when computing the HellaSwag score (default: 400)
  --keep N              number of tokens to keep from the initial prompt (default: 0, -1 = all)
  --draft N             number of tokens to draft for speculative decoding (default: 8)
  --chunks N            max number of chunks to process (default: -1, -1 = all)
  -np N, --parallel N   number of parallel sequences to decode (default: 1)
  -ns N, --sequences N  number of sequences to decode (default: 1)
  -pa N, --p-accept N   speculative decoding accept probability (default: 0.5)
  -ps N, --p-split N    speculative decoding split probability (default: 0.1)
  -cb, --cont-batching  enable continuous batching (a.k.a dynamic batching) (default: disabled)
  --mmproj MMPROJ_FILE  path to a multimodal projector file for LLaVA. see examples/llava/README.md
  --image IMAGE_FILE    path to an image file. use with multimodal models
  --mlock               force system to keep model in RAM rather than swapping or compressing
  --no-mmap             do not memory-map model (slower load but may reduce pageouts if not using mlock)
  --numa                attempt optimizations that help on some NUMA systems
                        if run without this previously, it is recommended to drop the system page cache before using this
                        see https://github.com/ggerganov/llama.cpp/issues/1437
  --verbose-prompt      print prompt before generation
  -dkvc, --dump-kv-cache
                        verbose print of the KV cache
  -nkvo, --no-kv-offload
                        disable KV offload
  -ctk TYPE, --cache-type-k TYPE
                        KV cache data type for K (default: f16)
  -ctv TYPE, --cache-type-v TYPE
                        KV cache data type for V (default: f16)
  --simple-io           use basic IO for better compatibility in subprocesses and limited consoles
  --lora FNAME          apply LoRA adapter (implies --no-mmap)
  --lora-scaled FNAME S apply LoRA adapter with user defined scaling S (implies --no-mmap)
  --lora-base FNAME     optional model to use as a base for the layers modified by the LoRA adapter
  -m FNAME, --model FNAME
                        model path (default: models/7B/ggml-model-f16.gguf)
  -md FNAME, --model-draft FNAME
                        draft model for speculative decoding
  -ld LOGDIR, --logdir LOGDIR
                        path under which to save YAML logs (no logging if unset)
  --override-kv KEY=TYPE:VALUE
                        advanced option to override model metadata by key. may be specified multiple times.
                        types: int, float, bool. example: --override-kv tokenizer.ggml.add_bos_token=bool:false

log options:
  --log-test            Run simple logging test
  --log-disable         Disable trace logs
  --log-enable          Enable trace logs
  --log-file            Specify a log filename (without extension)
  --log-new             Create a separate new log file on start. Each log file will have unique name: "<name>.<ID>.log"
  --log-append          Don't truncate the old log file.

Saturday, December 16, 2023

example interaction with ChatGPT 3.5 to produce a Lean proof

In a new conversation I provided the following prompt:
prompt

Answer the question based on the context below. Let's work this out in a step by step way to be sure we have the right answer.

Context: Lean is a proof assistant and programming language. Lean is based on the calculus of constructions with inductive types.

Context: There are multiple versions of Lean. This question is about Lean 4

`a=b` is a proposition. The variable `a` is Real valued and `b` is Real valued. Prove in version 4 of Lean that `(a=b) -> (a+2=b+2)`

ChatGPT 3.5 wrote in response
ChatGPT 3.5 response

Certainly! To prove the statement `(a = b) -> (a + 2 = b + 2)` in version 4 of Lean, you can follow these steps:

1. **Introduction of Variables**: Begin by introducing the variables and assumptions using the `variables` keyword. In this case, you have the variables `a` and `b` both being Real valued.

```lean
variables {a b : ℝ}
```

2. **Assumption**: State the assumption that `(a = b)`.

```lean
example (h : a = b) : a + 2 = b + 2 :=
```

3. **Transformation**: Use the assumption to transform the left-hand side of the conclusion `(a + 2)` to `(b + 2)` using the property of equality.

```lean
calc
   a + 2 = b + 2 : by rw h
```

Here, `calc` is used to chain together a sequence of expressions, and `rw h` applies the assumption `h`.

4. **QED (Quod Erat Demonstrandum)**: Conclude the proof.

```lean
   example (h : a = b) : a + 2 = b + 2 :=
   begin
     calc
       a + 2 = b + 2 : by rw h
   end
```

This step-by-step approach uses the Lean proof style to demonstrate the validity of the given proposition. The `calc` tactic is particularly useful for chaining together equalities.

On my computer I created a file "second_chatGPT_attempt.lean" and wrote
variables {a b : ℝ}

example (h : a = b) : a + 2 = b + 2 :=
begin
  calc
    a + 2 = b + 2 : by rw h
end
and then ran
docker run -it --rm -v `pwd`:/scratch --workdir /scratch lean3onubuntu lean --run second_chatGPT_attempt.lean
second_chatGPT_attempt.lean:2:0: error: unexpected identifier; expected command
second_chatGPT_attempt.lean:5:0: error: unknown identifier 'begin'
second_chatGPT_attempt.lean:6:2: error: unexpected token 'calc'; expected command
second_chatGPT_attempt.lean:8:0: error: invalid 'end', insufficient scopes
I then told ChatGPT
prompt

When I run the script

```lean
variables {a b : ℝ}

example (h : a = b) : a + 2 = b + 2 :=
begin
  calc
    a + 2 = b + 2 : by rw h
end
```

Lean 4 returns an error message: `error: unexpected identifier; expected command`

What should I do to fix the error?

ChatGPT 3.5 response

Apologies for the oversight. It seems I made an error in my previous responses. In Lean, you should use `theorem` or `example` as part of the proof declaration. Here's the corrected version:

```lean
variables {a b : ℝ}

theorem add_two_eq_add_two (h : a = b) : a + 2 = b + 2 :=
begin
  calc
    a + 2 = b + 2 : by rw h
end
```

In this corrected version, I replaced `example` with `theorem`. You can use `example` as well, but `theorem` is more appropriate when proving statements. This should resolve the error you encountered.

I then created a second Lean script, third_chatGPT_attempt.lean, and wrote
variables {a b : ℝ}

theorem add_two_eq_add_two (h : a = b) : a + 2 = b + 2 :=
begin
  calc
    a + 2 = b + 2 : by rw h
end
which got the same error as before when I attempted to run the script using Lean 4:
docker run -it --rm -v `pwd`:/scratch --workdir /scratch lean3onubuntu lean --run third_chatGPT_attempt.lean 
third_chatGPT_attempt.lean:1:0: error: unexpected identifier; expected command
third_chatGPT_attempt.lean:4:0: error: unknown identifier 'begin'
third_chatGPT_attempt.lean:5:2: error: unexpected token 'calc'; expected command
third_chatGPT_attempt.lean:7:0: error: invalid 'end', insufficient scopes
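The root cause: ChatGPT produced Lean 3 syntax (`variables`, `begin ... end`, and `calc ... : by rw h`), which Lean 4 rejects, as the error messages show. For reference, a version of the same proof that does compile under Lean 4, assuming Mathlib is available to provide ℝ:

```lean
import Mathlib.Data.Real.Basic

-- Lean 4 uses `variable` (not `variables`) and `by` tactic blocks
-- instead of Lean 3's `begin ... end`.
variable {a b : ℝ}

example (h : a = b) : a + 2 = b + 2 := by
  rw [h]
```

Here `rw [h]` rewrites the goal to `b + 2 = b + 2`, which `rw` then closes by reflexivity.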
  

LLM prompts for the Physics Derivation Graph

Posing a prompt that gets a useful result currently requires some consideration. Below are some possible tasks for LLMs, along with additional context for the LLM.

Convert Latex to SymPy

TODO:

Answer the question based on the context below. Keep the answer short and concise. Respond "Unsure about answer" if not sure about the answer.

Context: Latex is software for document preparation to generate PDF files. Mathematical equations can be expressed in Latex using markup syntax. 

Context: SymPy is an open-source Python library for symbolic computation. SymPy provides a computer algebra system. SymPy can convert Latex math to SymPy, and SymPy can render a mathematical expression as Latex.

Question: What is the SymPy representation for the Latex expression $x^2 + y^2 = 1$ ?

Answer:

Right response:

Answer the question based on the context below. Keep the answer short and concise. Respond "Unsure about answer" if not sure about the answer.

Context: Latex is software for document preparation to generate PDF files. Mathematical equations can be expressed in Latex using markup syntax. 

Context: SymPy is an open-source Python library for symbolic computation. SymPy provides a computer algebra system. SymPy can convert Latex math to SymPy, and SymPy can render a mathematical expression as Latex.

Question: What is the SymPy representation for the Latex expression $\vec{p}_{electron} = \vec{p}_{1}-\vec{p}_{2}$ ?

Answer:

Right response:
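(A plausible right response, treating each vector as a plain SymPy symbol rather than a vector object: `Eq(p_electron, p_1 - p_2)`.)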

Answer the question based on the context below. Keep the answer short and concise. Respond "Unsure about answer" if not sure about the answer.

Context: Latex is software for document preparation to generate PDF files. Mathematical equations can be expressed in Latex using markup syntax. 

Context: SymPy is an open-source Python library for symbolic computation. SymPy provides a computer algebra system. SymPy can convert Latex math to SymPy, and SymPy can render a mathematical expression as Latex.

Question: What is the SymPy representation for the Latex expression $x = \langle\psi_{\alpha}| \hat{A} |\psi_{\beta}\rangle$ ?

Answer:

Right response:
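Worth noting for grading prompts like these: SymPy ships its own LaTeX parser, which handles the first example above (it requires the antlr4-python3-runtime package, and it does not handle the bra-ket notation in the third example). A minimal sketch:

```python
from sympy.parsing.latex import parse_latex

# parse_latex returns a SymPy expression; an equation becomes an Eq object.
expr = parse_latex(r"x^2 + y^2 = 1")
print(expr)  # Eq(x**2 + y**2, 1)
```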

Specify the mathematical relation between period and frequency as an equation in Latex

TODO:

Caveat: the page https://en.wikipedia.org/wiki/Frequency includes the statement that

"period is the reciprocal of the frequency: f = 1/T."

Use the context below to produce a result. Keep the response short and concise. Respond "Unsure about answer" if not sure about the answer.

Context: Latex is software for document preparation to generate PDF files. Mathematical equations can be expressed in Latex using markup syntax. 

Context: Period (symbol T) is the interval of time between events. Frequency (symbol f) is the number of occurrences of a repeating event per unit of time.

Specify the mathematical relation between period and frequency as an equation in Latex.
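The answer being sought, as a LaTeX equation (either arrangement is equivalent):

```latex
f = \frac{1}{T} \qquad \textrm{or equivalently} \qquad T = \frac{1}{f}
```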

Find arxiv papers with derivations

TODO:

  • Explain arxiv
  • define what I mean by a derivation
  • Provide example citations

Provide citations based on the context below. 

Context: arxiv is an open-access repository of electronic preprints.

Context: a derivation in mathematical Physics consists of a sequence of steps. Each step relates mathematical expressions to an inference rule. An expression is comprised of symbols and operators. An inference rule typically transforms input expressions into output expressions.

Cite three papers from arxiv that contain mathematical derivations with more than four steps.

Identify derivation steps between physics equations

TODO:

  • define what I mean by a derivation
  • Provide example steps

Answer the question based on the context below. Keep the answer short and concise. Respond "Unsure about answer" if not sure about the answer. Let's work this out in a step by step way to be sure we have the right answer.

Context: Latex is software for document preparation to generate PDF files. Mathematical equations can be expressed in Latex using markup syntax. 

Context: a derivation in mathematical Physics consists of a sequence of steps. Each step relates mathematical expressions to an inference rule. An expression is comprised of symbols and operators. An inference rule typically transforms input expressions into output expressions.

Question: What mathematical steps relate the Latex math expression $i x = log(y)$ and $\exp(i x) = y$ ?

Answer:

Right answer: apply $\exp$ to both sides (exponentiate), giving $\exp(i x) = y$

Answer the question based on the context below. Keep the answer short and concise. Respond "Unsure about answer" if not sure about the answer. Let's work this out in a step by step way to be sure we have the right answer.

Context: Latex is software for document preparation to generate PDF files. Mathematical equations can be expressed in Latex using markup syntax. 

Context: a derivation in mathematical Physics consists of a sequence of steps. Each step relates mathematical expressions to an inference rule. An expression is comprised of symbols and operators. An inference rule typically transforms input expressions into output expressions.

Question: What is the derivative of $y = \cos(x) + i \sin(x)$ with respect to $x$ ?

Answer:

Right answer: $\frac{d}{dx} y = -\sin(x) + i\cos(x)$
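Unlike the more open-ended prompts, this one can be graded mechanically; a minimal SymPy check of the right answer:

```python
import sympy as sp

x = sp.symbols('x', real=True)
y = sp.cos(x) + sp.I * sp.sin(x)

# Differentiate and compare against the expected right answer.
expected = -sp.sin(x) + sp.I * sp.cos(x)
assert sp.simplify(sp.diff(y, x) - expected) == 0
print(sp.diff(y, x))  # -sin(x) + I*cos(x)
```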

Derive the wave function for a quantum particle in a 1D box

Answer the question based on the context below. Keep the answer short and concise. Respond "Unsure about answer" if not sure about the answer. Let's work this out in a step by step way to be sure we have the right answer.

Context: Latex is software for document preparation to generate PDF files. Mathematical equations can be expressed in Latex using markup syntax. 

Context: a derivation in mathematical Physics consists of a sequence of steps. Each step relates mathematical expressions to an inference rule. An expression is comprised of symbols and operators. An inference rule typically transforms input expressions into output expressions.

Question: derive the wave function for a quantum particle in a 1D box

Answer:

Right answer: see https://derivationmap.net/review_derivation/000010/

Derive Newton's Law of Universal Gravitation

Answer the question based on the context below. Keep the answer short and concise. Respond "Unsure about answer" if not sure about the answer. Let's work this out in a step by step way to be sure we have the right answer.

Context: a derivation in mathematical Physics consists of a sequence of steps. Each step relates mathematical expressions to an inference rule. An expression is comprised of symbols and operators. An inference rule typically transforms input expressions into output expressions.

provide a derivation of Newton's Law of Universal Gravitation

See https://physicsderivationgraph.blogspot.com/2023/06/finding-derivations-of-newtons-law-of.html

Convert derivation steps to a proof in Lean

TODO:

  • define what I mean by a derivation
  • Explain lean
  • Provide example
  • Emphasize correctness and precision

Answer the question based on the context below. Keep the answer short and concise. Respond "Unsure about answer" if not sure about the answer. Let's work this out in a step by step way to be sure we have the right answer.

Context: Lean is a proof assistant and programming language. Lean is based on the calculus of constructions with inductive types.

`a=b` is a proposition. The variable `a` is Real valued and `b` is Real valued. Prove in Lean that `(a=b) -> (a+2=b+2)`

Identify symbols in latex arxiv papers

TODO:

  • Provide example
  • Emphasize correctness and precision

Answer the question based on the context below. Keep the answer short and concise. Respond "Unsure about answer" if not sure about the answer.

Context: Latex is software for document preparation to generate PDF files. Mathematical equations can be expressed in Latex using markup syntax. 

Question: What mathematical expressions are present in the following Latex?

```
\begin{equation}
a = b + c
\end{equation}
Assuming you know $b$, then
\begin{equation}
b*c = f
\end{equation}
```

Answer:

Answer the question based on the context below. Keep the answer short and concise. Respond "Unsure about answer" if not sure about the answer.

Context: Latex is software for document preparation to generate PDF files. Mathematical equations can be expressed in Latex using markup syntax. 

Question: What variables are present in the following Latex?

```
\begin{equation}
a = b + c
\end{equation}
Assuming you know $b$, then
\begin{equation}
b*c = f
\end{equation}
```

Answer:
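For this extraction task a deterministic baseline exists to compare the LLM against; a sketch using regular expressions, covering only the two constructs that appear in the sample (equation environments and inline $...$ math):

```python
import re

latex_src = r"""
\begin{equation}
a = b + c
\end{equation}
Assuming you know $b$, then
\begin{equation}
b*c = f
\end{equation}
"""

# Display math inside equation environments, then inline $...$ spans.
display = re.findall(r"\\begin\{equation\}(.*?)\\end\{equation\}", latex_src, re.DOTALL)
inline = re.findall(r"\$([^$]+)\$", latex_src)

print([e.strip() for e in display])  # ['a = b + c', 'b*c = f']
print(inline)                        # ['b']
```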

best practices for prompts for large language models - LLMs

ChatGPT was made available by OpenAI on 2022-11-30. As of 2023-12-16 I hadn't used ChatGPT (Generative Pre-trained Transformer) or other large language models (LLMs). In this post I document best practices other folks have come up with. My intent is to identify whether ChatGPT could be useful for tasks relevant to the Physics Derivation Graph. 

Sites I reviewed for best practices for prompts:

General text response

Prompt

Answer the question based on the context below. Keep the answer short and concise. Respond "Unsure about answer" if not sure about the answer.

Context: <context>

Question: <question>

Answer:

Prompt

Before providing the answer, state what you know and what you assume. Then state the steps that led you to your answer. Finally, provide the answer.

Prompt

Let's think step by step.

Prompt

Let's work this out in a step by step way to be sure we have the right answer.

Prompt

Imagine three different experts are answering this question.
All experts will write down 1 step of their thinking, then share it with the group.
Then all experts will go on to the next step, etc.
If any expert realizes they are wrong at any point then they leave.
The question is...


What to use an LLM for

The following table is from https://www.philschmid.de/instruction-tune-llama-2
  • Brainstorming: Provide a diverse set of creative ideas for new flavors of ice cream.
  • Classification: Categorize these movies as either comedy, drama, or horror based on the plot summary.
  • Closed QA: Answer the question 'What is the capital of France?' with a single word.
  • Generation: Write a poem in the style of Robert Frost about nature and the changing seasons.
  • Information Extraction: Extract the names of the main characters from this short story.
  • Open QA: Why do leaves change color in autumn? Explain the scientific reasons.
  • Summarization: Summarize this article on recent advancements in renewable energy in 2-3 sentences.


Software generation

You are an expert programmer that writes simple, concise code and explanations. Write a python function to generate the nth fibonacci number.
 
A simple python function to remove whitespace from a string

Code Llama supports a special prompt format called infill:

<PRE> def compute_gcd(x, y): <SUF>return result <MID>

Code review

Where is the bug in this code?
```
def fib(n):
    if n <= 0:
        return n
    else:
        return fib(n-1) + fib(n-2)
```
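For reference, the bug this prompt is fishing for: the base case covers only `n <= 0`, so `fib(1)` evaluates to `fib(0) + fib(-1) = 0 + (-1) = -1`. A corrected version:

```python
def fib(n):
    # Base cases must cover both n == 0 and n == 1.
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)
```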

Tests of Software

write a unit test for this function:
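The function is left unspecified here; applied to the corrected `fib` above, the sort of test we'd hope an LLM produces (pytest convention; `my_module` is a hypothetical module containing `fib`):

```python
from my_module import fib  # hypothetical import, for illustration only

def test_fib_known_values():
    # The Fibonacci sequence begins 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55.
    assert fib(0) == 0
    assert fib(1) == 1
    assert fib(10) == 55
```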

Prompts for Retrieval Augmented Generation (RAG)

RAG = https://www.promptingguide.ai/techniques/rag 

You are an expert Q&A system that is trusted around the world.
Always answer the query only using the provided context information, and not prior knowledge.

Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {query_str}
Answer:

From https://docs.llamaindex.ai/en/stable/examples/prompts/prompts_rag.html

Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, 
answer the query asking about citations over different topics.
Some examples are given below.

{few_shot_examples}

Query: {query_str}
Answer:
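A sketch of how a template like the one above gets filled in before being sent to the model; plain Python string formatting, with the retrieval step that would produce `context_str` elided:

```python
TEMPLATE = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)

# context_str would normally come from a retriever, e.g. the semantic
# search sketch earlier in this post run over a corpus of PDG documents.
prompt = TEMPLATE.format(
    context_str="Period (symbol T) is the interval of time between events.",
    query_str="What is the relation between period and frequency?",
)
print(prompt)
```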