Tuesday, June 6, 2023

historical evolution of a git repo

JSON-like output:
git log --date=format-local:'%Y-%m-%d %H:%M:%S' \
--pretty=format:'{%n  "commit": "%H",%n  "author": "%aN <%aE>",%n  "date": "%ad",%n  "message": "%f"%n},' > all_logs.dat

as per https://stackoverflow.com/a/34778736/1164295 and https://gist.github.com/textarcana/1306223

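python3 -c "import json
with open('all_logs.dat', 'r') as fh:
    text = fh.read()
# git log emitted comma-separated objects; drop the last comma and wrap in [] to form a JSON array
content = json.loads('[' + text.rstrip().rstrip(',') + ']')
print(content)"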
A single tab-separated line per commit is easier to parse:
git log --date=format-local:'%Y-%m-%d %H:%M:%S' --pretty=format:"%H%x09%ae%x09%ad%x09%s" > all_hash
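
The tab-separated file can be loaded for further analysis. A minimal Python sketch, assuming all_hash holds one commit per line with the four fields in the order of the format string above (hash, author email, date, subject). The names Commit, parse_log_line and load_commits are made up for illustration.

```python
from datetime import datetime
from typing import List, NamedTuple


class Commit(NamedTuple):
    commit_hash: str
    email: str
    date: datetime
    subject: str


def parse_log_line(line: str) -> Commit:
    """Split one line of all_hash into its four tab-separated fields."""
    # maxsplit=3 keeps a tab inside the subject from creating extra fields
    commit_hash, email, date_text, subject = line.rstrip("\n").split("\t", 3)
    date = datetime.strptime(date_text, "%Y-%m-%d %H:%M:%S")
    return Commit(commit_hash, email, date, subject)


def load_commits(path: str = "all_hash") -> List[Commit]:
    """Read every non-blank line of the log file into a Commit record."""
    with open(path, encoding="utf-8") as fh:
        return [parse_log_line(line) for line in fh if line.strip()]
```

Calling load_commits() in the directory that holds all_hash returns the commits in the same order git log printed them.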

TODO:

  • how many commits per year?
  • sample the git repo at a given frequency and count number of files in the sample

general approach:

git clone [remote_address_here] my_repo
cd my_repo
git reset --hard [ENTER HERE THE COMMIT HASH YOU WANT]

as per https://stackoverflow.com/a/3555202/1164295

for each relevant hash, reset to it and count the files, e.g.:
git clone https://github.com/allofphysicsgraph/proofofconcept.git
cd proofofconcept
find . -type f | grep -v ".git" | wc -l
    3381
git reset --hard f12795798d2537d3fec80ba2b4d33396e52011bd
find . -type f | grep -v ".git" | wc -l
       2
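
Note that grep -v ".git" treats the dot as "any character", so it also drops paths such as ./.gitignore or anything under a directory named "digital". A hedged Python alternative that skips only the top-level .git directory is sketched below; count_files is a hypothetical helper, and its result may differ from the numbers recorded in this post.

```python
import os


def count_files(root: str = ".") -> int:
    """Count regular files under root, skipping only root's own .git directory."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root and ".git" in dirnames:
            # prune in place so os.walk does not descend into .git
            dirnames.remove(".git")
        for name in filenames:
            path = os.path.join(dirpath, name)
            # mirror find -type f: regular files only, no symlinks
            if os.path.isfile(path) and not os.path.islink(path):
                total += 1
    return total
```

Run from the top of the clone, count_files() plays the role of the find | grep | wc pipeline.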
number of commits in a year:
cat all_hash | grep 2014- | wc -l
      17
for year in {2014..2023}; do commits_per_year=`cat all_hash | grep ${year}- | wc -l`; echo $year $commits_per_year; done
2014 17
2015 234
2016 62
2017 41
2018 81
2019 30
2020 790
2021 67
2022 90
2023 5
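
The same tally can be computed from the date column alone, which avoids counting a line whose subject happens to contain something like "2014-". A minimal sketch, assuming the tab-separated layout of all_hash (date in the third field); commits_per_year is an illustrative name.

```python
from collections import Counter
from typing import Dict, Iterable


def commits_per_year(lines: Iterable[str]) -> Dict[int, int]:
    """Count commits per year using only the date field of each line."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        year = int(fields[2][:4])  # date looks like 2014-05-01 12:00:00
        counts[year] += 1
    return dict(sorted(counts.items()))
```

Usage: open all_hash and pass the file handle, e.g. commits_per_year(open("all_hash")).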
number of files at one commit per year (the first line of all_hash matching the year; git log lists newest first):
for year in {2014..2023}; do this_hash=`cat all_hash | grep $year | head -n 1 | cut -c-40`; git reset --hard $this_hash; file_count=`find . -type f | grep -v ".git" | wc -l`; echo $this_hash $year $file_count; done > counts_per_year.dat
cat counts_per_year.dat | grep -v HEAD
4289c2a3311d4e051bdab3b0d99f49b25dab6bc3 2014 1027
b81d6ddba5a2015d328975607318d7e7755b27aa 2015 3339
26b0d9fc8c49ede12c897b4bf4cd050765747a81 2016 2098
eec25f59649a4cc9e9e8b166355793b58b742672 2017 2194
201822fd2025349f8749b9433533d0d54c7363b3 2018 3007
918245c17bece668f868ce7201976e2d49dc1743 2019 3022
bd4fb0528c1a46ed2fac13aa16f77508aaa43e67 2020 3150
7dd27b734673e20e405cd26acbdf7d237cf73e33 2021 3343
ad8dfc5931922788f32a21f10906d97c50f7ca36 2022 3384
9df026b16827dfe97fc8a44c4063e493c21a49d4 2023 3384
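
A plain grep $year matches the year digits anywhere on the line, including inside a hash (the 2018 hash above itself starts with 2018). Selecting on the date column is more robust. A hedged sketch, assuming all_hash is tab-separated and ordered newest first, as git log prints by default; latest_commit_per_year is a made-up name.

```python
from typing import Dict, Iterable


def latest_commit_per_year(lines: Iterable[str]) -> Dict[int, str]:
    """Map each year to the hash of the first line dated in that year.

    With newest-first input, that is the most recent commit of the year.
    """
    chosen: Dict[int, str] = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        year = int(fields[2][:4])  # year taken from the date field only
        chosen.setdefault(year, fields[0])  # keep the first hash seen per year
    return dict(sorted(chosen.items()))
```

The resulting hashes can then be fed to git reset --hard one at a time, as in the loop above.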
