docker run -it --rm -v `pwd`:/scratch llama-cpp-with-mistral-7b-v0.1.q6_k:2023-12-22 /bin/bash
root@dc98ac4a23d5:/opt/llama.cpp# ./main -h
usage: ./main [options]
options:
-h, --help show this help message and exit
--version show version and build info
-i, --interactive run in interactive mode
--interactive-first run in interactive mode and wait for input right away
-ins, --instruct run in instruction mode (use with Alpaca models)
-cml, --chatml run in chatml mode (use with ChatML-compatible models)
--multiline-input allows you to write or paste multiple lines without ending each in '\'
-r PROMPT, --reverse-prompt PROMPT
halt generation at PROMPT, return control in interactive mode
(can be specified more than once for multiple prompts).
--color colorise output to distinguish prompt and user input from generations
-s SEED, --seed SEED RNG seed (default: -1, use random seed for < 0)
-t N, --threads N number of threads to use during generation (default: 20)
-tb N, --threads-batch N
number of threads to use during batch and prompt processing (default: same as --threads)
-p PROMPT, --prompt PROMPT
prompt to start generation with (default: empty)
-e, --escape process prompt escapes sequences (\n, \r, \t, \', \", \\)
--prompt-cache FNAME file to cache prompt state for faster startup (default: none)
--prompt-cache-all if specified, saves user input and generations to cache as well.
not supported with --interactive or other interactive options
--prompt-cache-ro if specified, uses the prompt cache but does not update it.
--random-prompt start with a randomized prompt.
--in-prefix-bos prefix BOS to user inputs, preceding the `--in-prefix` string
--in-prefix STRING string to prefix user inputs with (default: empty)
--in-suffix STRING string to suffix after user inputs with (default: empty)
-f FNAME, --file FNAME
prompt file to start generation.
-n N, --n-predict N number of tokens to predict (default: -1, -1 = infinity, -2 = until context filled)
-c N, --ctx-size N size of the prompt context (default: 512, 0 = loaded from model)
-b N, --batch-size N batch size for prompt processing (default: 512)
--samplers samplers that will be used for generation in the order, separated by ';', for example: "top_k;tfs;typical;top_p;min_p;temp"
--sampling-seq simplified sequence for samplers that will be used (default: kfypmt)
--top-k N top-k sampling (default: 40, 0 = disabled)
--top-p N top-p sampling (default: 0.9, 1.0 = disabled)
--min-p N min-p sampling (default: 0.1, 0.0 = disabled)
--tfs N tail free sampling, parameter z (default: 1.0, 1.0 = disabled)
--typical N locally typical sampling, parameter p (default: 1.0, 1.0 = disabled)
--repeat-last-n N last n tokens to consider for penalize (default: 64, 0 = disabled, -1 = ctx_size)
--repeat-penalty N penalize repeat sequence of tokens (default: 1.1, 1.0 = disabled)
--presence-penalty N repeat alpha presence penalty (default: 0.0, 0.0 = disabled)
--frequency-penalty N repeat alpha frequency penalty (default: 0.0, 0.0 = disabled)
--mirostat N use Mirostat sampling.
Top K, Nucleus, Tail Free and Locally Typical samplers are ignored if used.
(default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
--mirostat-lr N Mirostat learning rate, parameter eta (default: 0.1)
--mirostat-ent N Mirostat target entropy, parameter tau (default: 5.0)
-l TOKEN_ID(+/-)BIAS, --logit-bias TOKEN_ID(+/-)BIAS
modifies the likelihood of token appearing in the completion,
i.e. `--logit-bias 15043+1` to increase likelihood of token ' Hello',
or `--logit-bias 15043-1` to decrease likelihood of token ' Hello'
--grammar GRAMMAR BNF-like grammar to constrain generations (see samples in grammars/ dir)
--grammar-file FNAME file to read grammar from
--cfg-negative-prompt PROMPT
negative prompt to use for guidance. (default: empty)
--cfg-negative-prompt-file FNAME
negative prompt file to use for guidance. (default: empty)
--cfg-scale N strength of guidance (default: 1.000000, 1.0 = disable)
--rope-scaling {none,linear,yarn}
RoPE frequency scaling method, defaults to linear unless specified by the model
--rope-scale N RoPE context scaling factor, expands context by a factor of N
--rope-freq-base N RoPE base frequency, used by NTK-aware scaling (default: loaded from model)
--rope-freq-scale N RoPE frequency scaling factor, expands context by a factor of 1/N
--yarn-orig-ctx N YaRN: original context size of model (default: 0 = model training context size)
--yarn-ext-factor N YaRN: extrapolation mix factor (default: 1.0, 0.0 = full interpolation)
--yarn-attn-factor N YaRN: scale sqrt(t) or attention magnitude (default: 1.0)
--yarn-beta-slow N YaRN: high correction dim or alpha (default: 1.0)
--yarn-beta-fast N YaRN: low correction dim or beta (default: 32.0)
--ignore-eos ignore end of stream token and continue generating (implies --logit-bias 2-inf)
--no-penalize-nl do not penalize newline token
--temp N temperature (default: 0.8)
--logits-all return logits for all tokens in the batch (default: disabled)
--hellaswag compute HellaSwag score over random tasks from datafile supplied with -f
--hellaswag-tasks N number of tasks to use when computing the HellaSwag score (default: 400)
--keep N number of tokens to keep from the initial prompt (default: 0, -1 = all)
--draft N number of tokens to draft for speculative decoding (default: 8)
--chunks N max number of chunks to process (default: -1, -1 = all)
-np N, --parallel N number of parallel sequences to decode (default: 1)
-ns N, --sequences N number of sequences to decode (default: 1)
-pa N, --p-accept N speculative decoding accept probability (default: 0.5)
-ps N, --p-split N speculative decoding split probability (default: 0.1)
-cb, --cont-batching enable continuous batching (a.k.a dynamic batching) (default: disabled)
--mmproj MMPROJ_FILE path to a multimodal projector file for LLaVA. see examples/llava/README.md
--image IMAGE_FILE path to an image file. use with multimodal models
--mlock force system to keep model in RAM rather than swapping or compressing
--no-mmap do not memory-map model (slower load but may reduce pageouts if not using mlock)
--numa attempt optimizations that help on some NUMA systems
if run without this previously, it is recommended to drop the system page cache before using this
see https://github.com/ggerganov/llama.cpp/issues/1437
--verbose-prompt print prompt before generation
-dkvc, --dump-kv-cache
verbose print of the KV cache
-nkvo, --no-kv-offload
disable KV offload
-ctk TYPE, --cache-type-k TYPE
KV cache data type for K (default: f16)
-ctv TYPE, --cache-type-v TYPE
KV cache data type for V (default: f16)
--simple-io use basic IO for better compatibility in subprocesses and limited consoles
--lora FNAME apply LoRA adapter (implies --no-mmap)
--lora-scaled FNAME S apply LoRA adapter with user defined scaling S (implies --no-mmap)
--lora-base FNAME optional model to use as a base for the layers modified by the LoRA adapter
-m FNAME, --model FNAME
model path (default: models/7B/ggml-model-f16.gguf)
-md FNAME, --model-draft FNAME
draft model for speculative decoding
-ld LOGDIR, --logdir LOGDIR
path under which to save YAML logs (no logging if unset)
--override-kv KEY=TYPE:VALUE
advanced option to override model metadata by key. may be specified multiple times.
types: int, float, bool. example: --override-kv tokenizer.ggml.add_bos_token=bool:false
log options:
--log-test Run simple logging test
--log-disable Disable trace logs
--log-enable Enable trace logs
--log-file Specify a log filename (without extension)
--log-new Create a separate new log file on start. Each log file will have unique name: "<name>.<ID>.log"
--log-append Don't truncate the old log file.
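For reference, a minimal invocation inside the container might look like the following (the .gguf model path is an assumption; use wherever the model file actually lives in the image):
root@dc98ac4a23d5:/opt/llama.cpp# ./main -m models/mistral-7b-v0.1.Q6_K.gguf \
    -p "What is the reciprocal of the period of a wave called?" -n 128 -c 2048 --temp 0.8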
In a new conversation I provided the following prompt: [screenshot of prompt]
ChatGPT 3.5 wrote in response: [screenshot of response]
On my computer I created a file "second_chatGPT_attempt.lean" and wrote
variables {a b : ℝ}

example (h : a = b) : a + 2 = b + 2 :=
begin
  calc
    a + 2 = b + 2 : by rw h
end
Posing a prompt that gets a useful result currently requires some consideration. Below are some possible tasks for LLMs, along with additional context for the LLM.
"period is the reciprocal of the frequency: f = 1/T."
As of 2025-01-25 on https://aistudio.google.com/, "Gemini 2.0 Flash Thinking Experimental" returns the following:
[screenshots: the model's "Thoughts" and its answer]
Find arxiv papers with derivations
How to improve chances of success:
Explain arxiv
Define what I mean by a derivation
Provide example citations
Identify derivation steps between physics equations
How to improve chances of success:
Define what I mean by a derivation
Provide example steps
Right answer: exponentiate both sides (i.e., raise $e$ to the power of each side)
As of 2025-01-25 on https://aistudio.google.com/, "Gemini 2.0 Flash Thinking Experimental" returns the following:
[screenshots: the model's "Thoughts" and its answer]
Right answer: $\frac{d}{dx} y = -\sin(x) + i\cos(x)$
As of 2025-01-25 on https://aistudio.google.com/, "Gemini 2.0 Flash Thinking Experimental" returns the following:
[screenshots: the model's "Thoughts" and its answer]
Derive the wave function for a quantum particle in a 1D box
I removed "Keep the answer concise. Respond "Unsure about answer" if not sure about the answer. Let's work this out in a step by step way to be sure we have the right answer." from the prompt.
As of 2025-01-25 on https://aistudio.google.com/, "Gemini 2.0 Flash Thinking Experimental" returns the following:
[screenshots: the model's "Thoughts" and its answer]
Convert derivation steps to a proof in Lean
How to improve chances of success:
Define what I mean by a derivation
Explain Lean
Provide example
Emphasize correctness and precision
I removed "Keep the answer short and concise. Respond "Unsure about answer" if not sure about the answer. Let's work this out in a step by step way to be sure we have the right answer." from the prompt.
As of 2025-01-25 on https://aistudio.google.com/, "Gemini 2.0 Flash Thinking Experimental" returns the following:
[screenshots: the model's "Thoughts" and its answer]
I then clarified: [screenshot of follow-up prompt]
and Gemini provided: [screenshot of response]
I tried compiling with Lean and learned the syntax was incorrect. The following is closer to correct:

import Mathlib.Data.Real.Basic

variable (a b : Real)

example : (a = b) -> (a + 2 = b + 2) := by
  intro h
  rw [h]
  exact rfl
but has the error "no goals to be solved"
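The "no goals to be solved" error arises because in Lean 4 `rw [h]` automatically tries `rfl` after rewriting, which already closes the goal; the final `exact rfl` then has nothing left to prove. Dropping that line gives a version that compiles:

import Mathlib.Data.Real.Basic

variable (a b : Real)

example : (a = b) -> (a + 2 = b + 2) := by
  intro h
  rw [h]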
Identify symbols in Latex arxiv papers
How to improve chances of success:
Provide example
Emphasize correctness and precision
As of 2025-01-25 on https://aistudio.google.com/, "Gemini 2.0 Flash Thinking Experimental" returns the following:
[screenshots: the model's "Thoughts" and its answer]
ChatGPT was made available by OpenAI on 2022-11-30. As of 2023-12-16 I hadn't used ChatGPT (Generative Pre-trained Transformer) or other large language models (LLMs). In this post I document best practices other folks have come up with. My intent is to identify whether ChatGPT could be useful for tasks relevant to the Physics Derivation Graph.
Strong understanding = unique, well-tested theory that accounts for all the data we have within the domain of applicability (defined by SMC at 14:00).
Consequences: There are no competing theories. Modifications or extensions are feasible, but the domain is "solved." The theory won't be displaced.
Examples: general relativity, Newtonian dynamics, Newtonian gravity.
Weak understanding = more than one theory can account for the data. Unable to discriminate which is relevant since theories make the same predictions. (defined by SMC at 15:44)
Consequences: It's not clear which theory is right. The right theory may not have been invented yet.
Examples:
Foundations of quantum mechanics (e.g., Copenhagen interpretation, Bohmian mechanics, many worlds, jaggets platz)
dark matter and dark energy in cosmology: properties are known, but multiple theories fit the data
dark matter: WIMPs, axions
dark energy: vacuum energy, dynamical quintessence-like fields
No understanding = have data but no theory (defined by SMC at 18:20)
Examples: what happens at or before the big bang
SMC's claim: We have either a strong or weak understanding of everything that is accessible through measurement. (at 21:40) There's nothing that's experimentally accessible and not understood. That's new!
Survey of domains and relations
What is it that we know?
Newtonian dynamics. Space is separate from time. Deterministic Laplacian evolution.
Theory of relativity (1905, Einstein) explains space-time (as per Minkowski, 1907). (SMC: 29:22)
Special Relativity: how space-time is when gravity is not important; when space-time is flat. (SMC 30:20)
General Relativity: space-time can be curved, and that curvature is gravity. Predicts the big bang, black holes. (SMC 30:10)
The HTML from .tex effort is a collaboration between Deyan Ginev (a student of Dr. Kohlhase) and Bruce Miller (at NIST - https://www.nist.gov/people/bruce-r-miller).
Kohlhase's group (https://kwarc.info/research/) focuses on semantic enrichment of Latex. Bruce provided the software to convert Latex.
The reason for this Latex-to-HTML conversion is that it's the first step toward enabling semantic enrichment of content on the arxiv. There's immediate benefit to arxiv in being able to support HTML, which I suspect is why arxiv cooperated with Kohlhase's group.
In the long term I see a need to connect semantic tagging (the focus of Kohlhase's group) with formal verification (e.g., derivations-as-proofs using Lean). The formally verified math expressions need to be tied to use in narrative text (e.g., arxiv papers). For example, if I'm referring to "x" in a publication, is that the same "x" specified in a Lean-based proof? One way of answering is to use tags like
<unique_variable_id=42>x</unique_variable_id>
in the narrative text document, and then have Lean reference the ID 42 in derivations.
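As a sketch of how the Lean side could reference the same ID, one hypothetical convention is to embed the ID in the Lean identifier so the two artifacts can be cross-referenced mechanically (the names below are illustrative, not an existing standard):

import Mathlib.Data.Real.Basic

-- Hypothetical convention: <unique_variable_id=42>x</unique_variable_id> in the
-- narrative text corresponds to the identifier x_42 in the Lean derivation.
variable (x_42 y_17 : Real)

example (h : x_42 = y_17) : x_42 + 2 = y_17 + 2 := by
  rw [h]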
There are more conventional uses of tags, but those tags don't address the question of whether two variables refer to the same concept.
Summary
I predict that the conversion of arxiv .tex files to HTML will enable semantic tagging.
This will intersect with the challenge of "how do I relate the use of variables and expressions across multiple proofs in Lean?"
Papers published in arxiv using Latex contain narrative text and mathematical expressions. How could we have confidence that the content of a published paper is related to a formal proof?
The narrative text would refer to symbols, expressions, units, and quantities. The expressions and symbols could be embedded in the Latex source using a package that provides macros.
The author's .tex would include loading a package, something like the following (hypothetical package name):
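\usepackage{pdg} % hypothetical package name for illustration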
Then when the author references an expression they would use
\begin{equation}
\pdgexp{192839}
\end{equation}
When compiled to PDF, the macro \pdgexp calls the server at http://serveraddressforpdg.net to fetch the Latex expression. The same could be done for an inline reference to a symbol, like \pdgsymbol{93831911}.
In this approach there wouldn't need to be an appendix for the derivation because the derivation steps are hosted on the PDG server.
The chain of custody would be
arxiv PDF -- arxiv .tex -- .tex contains references to PDG expression and symbol IDs -- PDG server contains derivation steps -- steps are validated using Lean
The macro might need to cache the PDG database locally so the server doesn't get called every time the .tex is compiled to PDF. Also, allow the author to force a refresh of the cache.
Suppose I have a Python program that takes "angle" and "initial speed" as arguments and calculates where a ball shot from a cannon lands. The program could be described using the function
f(\theta, s) = d
where \theta is the angle, s is the speed, and d is the distance from the cannon where the ball lands. The function could be expressed using a variety of languages like Python, C, Lean, etc.
If f(\theta, s) = d is derived from the equations of motion (like https://derivationmap.net/review_derivation/201726/) and each step is verified using Lean (as described in this post), then we can use Lean to calculate d from \theta and s. The difference would be that there's more explanation of where the result came from. Rather than just "here's the equation to use," including the derivation steps and the proof of each step makes the assumptions explicit and the implementation correct.
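As a concrete (if simplified) sketch, assume level ground, no air resistance, and launch from ground level; then the standard range formula d = (s^2/g)*sin(2*\theta) implements f directly:

import math

G = 9.81  # gravitational acceleration [m/s^2]

def f(theta, s):
    """Distance d [m] where the ball lands, for launch angle theta [radians]
    and initial speed s [m/s]. Assumes level ground and no air resistance."""
    return (s ** 2) * math.sin(2 * theta) / G

print(f(math.radians(45), 100.0))  # about 1019 m for a 45-degree shot at 100 m/s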
All software programs can be described as functions that take arguments and return results. That doesn't mean that all functions are derived. As an example of software that is a function but isn't derived, consider the A* routing algorithm. An algorithm outlines a sequence of steps for accomplishing a function.
While all programs could be written in Lean (because each program is a function), not all functions are derived mathematically. As a trivial example, printing "hello world" is not a derived function. In contrast, the first example in this post, the distance a projectile flies, is derived from the equations of motion. What distinguishes "hello world" and A* routing from the cannon ball trajectory?
The distinction is that the cannon ball example is a numerical simulation of a tangible scenario, whereas "hello world" is not describing physically-constrained reality.
For numerical simulations, Lean can formally verify the derivation and the formula used for analytical calculation. Keeping both aspects in Lean results in a clear chain of custody.
For algorithms (e.g., A* routing, "hello world"), the verification is to show that the implementation is consistent with the requirements. Here languages like Dafny and F* are relevant.
In this post I explore the concept that each derivation can be phrased as a proof of the initial declaration and the final result. Also, each step in a derivation can be phrased as a proof.
In comparison, a PDG Inference Rule is more generic:
add __ to both sides of (LHS=RHS) to get (LHS+__=RHS+__)
The PDG Inference Rule is generic because the types of LHS and RHS are undefined. Are they Real? Complex? Matrices?
Types matter for inference rules because mixing types may not work. "Add 1 to both sides of (LHS=RHS)" won't work if LHS and RHS are 2x2 matrices.
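In Lean the inference rule can be stated once for any type that has addition, which makes the type requirement explicit: the rule applies whenever LHS, RHS, and the added term share a single type. A minimal sketch:

-- "Add c to both sides of (lhs = rhs)" as a typed inference rule: valid for
-- any type α with an addition operation, provided lhs, rhs, and c all have type α.
theorem add_to_both_sides {α : Type} [Add α] (c : α) {lhs rhs : α}
    (h : lhs = rhs) : lhs + c = rhs + c := by
  rw [h]

"Adding 1 to both sides" of a matrix equation then fails at type checking, because 1 and the matrices don't share a type (unless a coercion is defined).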
I'm beginning to appreciate that a computer algebra system (CAS) is not sufficient for the Physics Derivation Graph (PDG). The evidence of this is that the variables I'm defining can be real or complex; that is not distinguished by the computer algebra system.
The simple story of starting with
a = b
and then adding 2 to both sides to get
a+2 = b+2
is appropriate for a computer algebra system. But if "a" is a matrix then the operation is invalid. The only way to distinguish "scalar a" from "vector a" from "matrix a" is to specify the difference. [SymPy does have support for predicates -- https://docs.sympy.org/latest/guides/assumptions.html#predicates .]
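A minimal SymPy sketch of the distinction:

from sympy import symbols, Eq, MatrixSymbol

# a and b declared as real-valued scalars; the assumption travels with the symbol
a, b = symbols('a b', real=True)
print(a.is_real)          # True
print(Eq(a + 2, b + 2))   # "add 2 to both sides" is fine for real scalars

# For a 2x2 matrix symbol, adding the scalar 2 is rejected
A = MatrixSymbol('A', 2, 2)
try:
    A + 2
except TypeError as err:
    print(err)            # SymPy refuses to mix matrix and scalar addition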
In the Physics Derivation Graph I want to be more specific about the possible values of a and b.
Even for non-physics derivations like the Euler equations, there are assumptions about the possible values of each variable.
I don't know Lean, and I also don't know the foundational concepts of theorems and proofs.
Is a step a theorem?
Is a derivation a theorem?
The role that inference rules play in derivation steps does not obviously map to anything in a theorem.
The same mathematical model shows up in four kinds of content:
Derivations formalized in a theorem prover (e.g., Lean).
Formally-verified software implementations (e.g., Dafny, F*).
Performant parallelized software (e.g., MPI + OpenMP + Fortran or C/C++) for High Performance Computing (HPC).
Peer-reviewed journal articles, e.g., arxiv; rendered using Latex. [Of the four, this has historically been closest to the focus of the Physics Derivation Graph.]
Each of these content sources is independent, and each expresses the mathematical model in a different way.
While kernels of mathematical models can be implemented in formally-verified software, a lot of scientific computing uses High Performance Computing (HPC). In the HPC realm the compilers and libraries and runtimes can all be sources of bugs.
Need to compile or interpret software with a verified compiler or interpreter.
How can the consistency of content across sources be ensured?
There are 6 bilateral transitions for the 4 sources. For example, how does the model transition between theorem prover (e.g., Lean) and formally-verified software implementation?
Another possible issue: provenance
May need to track provenance from source code to binaries. Is the verified algorithm what was provided to the compiler? Is the binary I'm currently using related to the compiler output? Need to be able to distinguish "binary I don't know the origin of" from "binary that has a known chain of custody."
This could be just a log of hashes coupled to metadata for the "where did this come from." The provenance needs to include versions for compilers and libraries and runtimes. The binary should be reproducible. See https://en.wikipedia.org/wiki/Reproducible_builds
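A minimal sketch of such a log entry, assuming a local JSON-lines file and hypothetical field names:

import datetime
import hashlib
import json

def record_provenance(path, compiler, compiler_version, source_commit,
                      logfile="provenance.jsonl"):
    """Append a hash-plus-metadata record for one build artifact (hypothetical schema)."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    entry = {
        "file": path,
        "sha256": digest,
        "compiler": compiler,
        "compiler_version": compiler_version,
        "source_commit": source_commit,
        "recorded": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }
    with open(logfile, "a") as log:
        log.write(json.dumps(entry) + "\n")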
Question: was Newton's law of universal gravitation found empirically (from measurements) or can it be derived mathematically? The short answer is yes -- both. The historical process is separate from what can be done now.
distance between Earth and Moon = 60*(radius of Earth)
radius of Earth \approx 6.4*10^6 m
orbital period of Moon = 27.32 days
Newton figured out how fast the Moon is circling the Earth using
velocity = distance/time
where, in the case of the moon circling the Earth,
distance the moon travels around the Earth = circumference = 2*pi*r_{orbit of Moon}
Plugging in numbers, velocity of Moon around Earth = 1022 m/s.
That can then be plugged into the centripetal acceleration,
a_{centripetal} = v^2/r
How does a_{centripetal} compare to the gravitational acceleration g?
g/a_{centripetal} = 60^2
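With numbers (g \approx 9.8 m/s^2 and r = 60*(radius of Earth) \approx 60*6.4*10^6 m):

a_{centripetal} = v^2/r = (1022 m/s)^2 / (60*6.4*10^6 m) \approx 2.7*10^{-3} m/s^2

g/a_{centripetal} \approx 9.8 / (2.7*10^{-3}) \approx 3600 = 60^2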
Noticing that the factor of 60 is common to the ratio and to the distance between the Earth and the Moon, Newton figured that gravitation follows an inverse square law. Newton then checked this against data from observational studies of planets.
That's a big leap to F = G*(M*m)/(r^2), so are there more mathematical routes?
Mathematical Derivations
From nasa.gov
I first found NASA's classroom materials. There are some leaps that I wasn't able to follow. The same content of that page is duplicated on a related NASA page. The derivation starts with F=ma and a_{centripetal} = v^2/r. The author mentions Kepler's Third Law but says T^2 \approx R^3 (which is dimensionally inconsistent) when they mean T^2 \propto R^3. The misuse of \approx and \propto continues throughout the rest of the derivation.
velocity = distance/time
and the distance here is the circumference, 2*pi*R, so
period T = (2*pi*R)/v
Drop the 2*pi to get
v \approx R/T
Square both sides and apply Kepler's Third law
T^2 \propto R^3
to get
v^2 \propto 1/R
The second source of my confusion is subscripting versus superscripting -- v_2 versus v^2.
F = (m*(v^2))/R
Substituting v^2 \propto 1/R into this gives F \propto m/R^2, an inverse square law.
I tried submitting a correction to NASA's feedback page but couldn't since the Captcha is missing. :(
From a tutorial
Next I found another student-oriented page that has a derivation which is even less helpful than NASA's. The derivation presented starts from angular speed and F=mr\omega^2.
What would it take to integrate support for symbol detection and conversion to SymPy for a single step in a derivation?
1. User provides the initial expression in Latex to the web UI.
2. Computer parses symbols and operators from the Latex.
3. Computer searches the Physics Derivation Graph database of symbols and operators to find candidate symbols.
4. Computer provides candidate symbols to the user and prompts, "which of the following symbols were you referring to?"
5. Computer parses the expression to SymPy, returns the AST to the user, and prompts, "is this the AST you meant?"
6. If yes, continue; if no, go back to step 1 or provide corrections to the AST.
7. User provides the next expression in Latex.
8. Computer parses symbols and operators from the Latex.
9. If symbols match symbols already used in this derivation, then associate with those; otherwise,
10. Computer searches the Physics Derivation Graph database of symbols and operators to find candidate symbols.
11. If the computer had to search the PDG database, then the computer provides candidate symbols to the user and prompts, "which of the following symbols were you referring to?"
12. Computer parses the expression from step 7 to SymPy, returns the AST to the user, and prompts, "is this the AST you meant?"
13. Computer uses brute force to check every inference rule using a CAS against the provided expressions to "guess" the inference rule (see the sketch below).
14. If a valid inference rule is found, continue to the next expression; if no valid inference rule is found, prompt the user to provide the inference rule.
15. Given the inference rule and associated expressions, use the CAS to verify the step.
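A minimal sketch of the brute-force inference-rule check in step 13, assuming SymPy as the CAS and a hypothetical rule-checking function per inference rule:

from sympy import symbols, Eq, simplify

a, b = symbols('a b', real=True)

def matches_add_to_both_sides(before, after):
    """If 'after' is 'before' with the same term added to both sides,
    return that term; otherwise return None."""
    lhs_diff = simplify(after.lhs - before.lhs)
    rhs_diff = simplify(after.rhs - before.rhs)
    return lhs_diff if simplify(lhs_diff - rhs_diff) == 0 else None

def guess_inference_rule(before, after):
    """Step 13: try every known rule checker against a pair of expressions."""
    rules = {"add __ to both sides": matches_add_to_both_sides}
    for name, check in rules.items():
        term = check(before, after)
        if term is not None:
            return name, term
    return None  # step 14: prompt the user for the inference rule

print(guess_inference_rule(Eq(a, b), Eq(a + 2, b + 2)))
# ('add __ to both sides', 2)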
Large Language Models like ChatGPT are a hot topic due to the novelty of results in multiple application domains. Stepping back from the hype, the central capabilities seem to include summarization of content, information retrieval, and creative synthesis. Unfortunately those are not separate categories -- the summarization or information retrieval can contain hallucinations that get stated confidently.
Focusing on the topic of information retrieval and setting aside hallucinations, let's consider alternative mechanisms for search:
plain text search, like what Google supports
boolean logic, i.e., AND/OR/NOT
use of special indicators like wild cards, quotes for exact search
regular expressions
graph queries for inference engines that support inductive, deductive, and abductive reasoning
Except for the last, those search mechanisms all return specific results from a previously collected set of sources.
--> I expect conventional search to remain important. There are cases where I really am looking for a specific document and not a summarization.
--> Specialized search capabilities like regular expressions and wild cards will remain relevant for matching specific text strings. An LLM might provide suggestions on designing the regex.
--> Graph queries rely on bespoke databases that LLMs are not trained on currently. I'm not aware of any reason these can't be combined.
The Physics Derivation Graph effectively provides a knowledge graph for mathematical Physics. Combining this with machine learning is feasible.
Semantic tagging of documents has the potential to enrich the reader's experience because tagged content is easier to search. The burden of work is on the document author to provide the right tags. Worse, the document author has to find tags that are consistent with uses in other documents -- consistency of tags is necessary for search. This extra work of 1) tagging and 2) using consistent tags is a reason semantic enrichment hasn't become mainstream.
The Physics Derivation Graph faces a similar challenge. If the Physics Derivation Graph relies on using appropriately annotated symbols (effectively equivalent to a subset of semantic tags), then the PDG has the same burdens of work on individual authors.
The incentive for the individual researcher authoring a paper to use the Physics Derivation Graph is when there's integration with a computer algebra system that can check the correctness of steps. Then the author benefits from immediate feedback before sharing with others for review.
Annotating symbols probably isn't sufficient to motivate the work, but integration with a computer algebra system could provide incentive. Currently, the use of a computer algebra system requires detailed steps to be specified by the author.
There are ways to partially automate both symbol annotation and specifying steps. For symbol annotation, the computer could guess from context which symbols are being used. In a similar reliance on context, the user could provide leaps in specifying a derivation that the computer then tries to fill in with the detailed steps.
The training data doesn't have inference rules, so the output from the LLM doesn't have inference rules. As a consequence, the output of the LLM cannot be confirmed by a Computer Algebra System. The output text needs to be validated by a human. LLMs are hallucinating answers that sound reasonable, so checking each step is still vital.
The ability to resolve distinct variables across all of Mathematical Physics is beyond the scope of the training data.
On a positive note, if the Physics Derivation Graph content existed, I now think an LLM-based approach could be used to make progress in Mathematical Physics.
The Physics Derivation Graph relies on a diverse set of skills. To speed up the development process we could identify separable tasks and then spread the tasks among a team of contributors. Collaboration requires coordination, and that coordination can be top-down or organic or a mixture.
This post focuses on the top-down approach and assumes an omniscient view.
A standard data structure in project management is the Gantt chart. A Gantt chart uses information about tasks, task dependencies, task durations, and dates to create a visualization associated with a project.
task ID | task description | task duration [days] | depends on tasks | earliest start date
14235   | something useful | 3                    | N/A              | 2022-03-01
25532   | hard work        | 2                    | [14235]          | N/A
3456252 | keeping busy     | 3                    | [25532]          | N/A
That table can be visualized with tasks versus time:
[Figure: visualization of a Gantt chart with four tasks. Tasks 2 and 3 depend on task 1 being completed; task 4 depends on task 2 being completed.]
That data structure doesn't account for staffing, skills, equipment, or budget. The Gantt chart doesn't account for uncertainty of task duration, nor alternative paths.
Gantt charts present a single path
Project management involves contingency planning.
IF this THEN
    that
ELSE
    other
Every conditional statement is a branching of possible paths, each a separate Gantt chart.
A single Gantt chart is a snapshot of a single path.
Staffing, budget, equipment, skills, uncertainty
Augmenting the basic Gantt chart means extending the table data structure to something like
task ID | task description | task duration [days] | depends on tasks | earliest start date | depends on equipment | minimum skill set and level | uncertainty in duration [days]
14235   | something useful | 3 | N/A     | 2022-03-01 | [Photoshop]    | photo editing, intermediate            | +/-1
25532   | hard work        | 2 | [14235] | N/A        | [Excel]        | math, beginner; text editing, beginner | +2
3456252 | keeping busy     | 3 | [25532] | N/A        | Chrome browser | clicking on links, beginner            | 0
That information needs to be augmented with a cost table for equipment:
equipment      | cost per day [USD] | acquisition cost [USD]
Photoshop      | 0                  | 100
Excel          | 0                  | 20
Chrome browser | 0                  | 0
Lastly, we need people who can do the tasks.
person name | hourly cost [USD] | skill and level                             | dates available
Alice       | 20                | Excel, beginner; text editing, intermediate | [2022-01-02, 2022-01-03, 2022-01-04]
Bob         | 15                | Excel, intermediate; Math, beginner         | [2022-02-01, 2022-02-15, 2022-02-24]
Charlie     | 24                | photo editing, beginner                     | [2022-01-12, 2022-01-23, 2022-01-24]
Caveat: the above model is focused exclusively on experts doing tasks using equipment. The model does not account for managers, nor does the model account for support staff. The model could include any scope of tasks, but a boundary needs to be drawn somewhere to avoid becoming Seldon's psychohistory. The problem of tasks existing outside the model is the risk that out-of-model tasks block project progress or alter the project cost. Anything outside the model could be an invisible dependency.
Derived project views
The following views can be derived from the three tables above:
standard task Gantt chart (see the sketch after this list for deriving task start days)
project duration histogram. (Each "task duration" has an uncertainty that can be aggregated and results in variance.)
per-person activity schedule -- who works on what, when; prioritized by task dependencies and by when people with the needed skills are available
cost per day -- the spend rate
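As a sketch of how the first view is derived, the earliest start of each task follows from the dependency table alone (task data from the hypothetical table above):

# Derive earliest start offsets (in days) from the task-dependency table.
tasks = {
    14235:   {"duration": 3, "depends_on": []},
    25532:   {"duration": 2, "depends_on": [14235]},
    3456252: {"duration": 3, "depends_on": [25532]},
}

def earliest_start(task_id):
    """Day offset at which a task can begin: after all its dependencies finish."""
    return max(
        (earliest_start(dep) + tasks[dep]["duration"]
         for dep in tasks[task_id]["depends_on"]),
        default=0,
    )

for tid in tasks:
    print(tid, "can start on day", earliest_start(tid))
# 14235 on day 0, 25532 on day 3, 3456252 on day 5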
Blockers inhibiting the omniscient project management view
Even though the issue can be formulated as a set of data structures, collecting and maintaining the data is the hard part. Blockers are the amount of time needed to
gather the relevant information and keep the information up-to-date as the situation evolves
document the information and tune the model
survey the skills of the workforce
have the workforce track their time spent on a task
define each task, along with a "definition of done"
track costs of the workforce and equipment
identify conditional paths and track which alternative was selected
Blockers that aren't time-based:
inability to break project into atomic tasks (where "atomic" is based on skill set and skill level)
can break project into tasks, but unable to identify relevant skills for task
The centralization of this model is merely to construct a narrative. Software like Atlassian's Jira is a way of distributing the task tracking rather than trying to administer the project in a centralized, top-down approach.
Conclusion
The top-down omniscient view of project management is an unrealistic fantasy. However, it might be a helpful artifact for negotiation among decision makers. The alternative (where decision makers don't have a quantitative model to argue about) devolves into reliance on personal relationships, turf battles, and political factions. Bureaucratic processes evolve as a substitute for the lack of top-down omniscient view of project management.