Git is ideal for tracking changes to text-like files, such as code, static CSV lookup tables, your to-do list, etc. ‘Binary’ files (like .docx or .jpeg) can be included too, but a hidden copy is saved (in .git/) every time one is changed, which can make the repository quite bloated. In this demo, we will create a temporary folder of pseudo-code and see how we can use git to keep parallel versions and to view and restore history.
> library(git4r)
> myproject = tempfile(pattern = 'test-git-project-')
> dir.create(myproject)
> setwd(myproject)
> # Make some pseudo files for our repository
> for(my_code_files in LETTERS[1:5]) write(sample(letters,10), file=my_code_files)
We can create a checkpoint (a commit) whenever we want, which saves the contents of any files we choose to add. We can then bring back this version of the files at any time in the future and see the short message describing the change. Normally we want to add all files to a commit at once (back up everything), but you might want to split the changes in two, for example committing one file with its own message and an unrelated change in a separate commit.
You will be asked first whether to convert the normal directory to a git repository – always make sure you are in the right working directory so you don’t start git-tracking your entire home directory!
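Before answering that prompt, it can help to check where you are. The sketch below uses only base R; `find_git_root()` is a hypothetical helper written for this illustration (it is not part of git4r) and simply walks up the directory tree looking for a `.git` entry.

```r
# Hypothetical helper (not part of git4r): walk up from `path` looking for a
# '.git' entry and return the directory that contains it, or NA if none.
find_git_root <- function(path = getwd()) {
  path <- normalizePath(path, winslash = "/", mustWork = TRUE)
  repeat {
    if (file.exists(file.path(path, ".git"))) return(path)
    parent <- dirname(path)
    if (identical(parent, path)) return(NA_character_)  # reached the filesystem root
    path <- parent
  }
}

getwd()          # confirm this is the project folder, not your home directory
find_git_root()  # NA means no enclosing repository was found
```

If this returns a directory higher up than expected (for example your home directory), that folder is already a repository and is probably not the one you meant to use.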
> git_add()
test-git-project-40a3cbf1356 directory is not inside a git repository - check this is the top level else ESCAPE and setwd()
Should this directory be turned into a git repo? (Y/N) Y
Copying default .gitignore to working directory
test-git-project-40a3cbf1356 is now a git repo
Modified files to be added (+ created, - deleted, * changed since added, ? conflict)
1 + .gitignore
2 + A
3 + B
4 + C
5 + D
6 + E
Which file numbers to add? (Hit ENTER to add all non-conflicting, else ESCAPE) 1 2 3 4
Adding 4 file(s)
After adding the first four files, we re-run git_add() and see that we are given the option to un-add any we regret. This time we add all of the remaining files by hitting ENTER.
> git_add()
Changes staged already
1 + .gitignore
2 + A
3 + B
4 + C
Any file numbers to un-add? (Hit ENTER to keep all)
Modified files to be added (+ created, - deleted, * changed since added, ? conflict)
1 + D
2 + E
Which file numbers to add? (Hit ENTER to add all non-conflicting, else ESCAPE)
Adding 2 file(s)
We can now commit all of these changes – our first checkpoint
> git_commit()
Using global (system default) identity: myname <myname@email>
Commit message: My first commit
Commit to master? (hit ESCAPE to cancel)
Done
If we now make a whole bunch of changes, we will be able to see the new and old version of each file, along with our helpful commit message, which should briefly describe in one sentence what was changed and why. The ‘who’ and ‘when’ are recorded automatically.
> for(even_more_files in LETTERS[4:7]) write(sample(1:10,5), file=even_more_files, append=TRUE)
> git_add()
Modified files to be added (+ created, - deleted, * changed since added, ? conflict)
1 D
2 E
3 + F
4 + G
Which file numbers to add? (Hit ENTER to add all non-conflicting, else ESCAPE)
Adding 4 file(s)
> git_commit()
Using global (system default) identity: myname <myname@email>
Commit message: Add some more files and made some changes
Commit to master? (hit ESCAPE to cancel)
Done
We can view the change history of a folder (always relative to the top directory) or only the commits which changed a particular file. Each entry shows the commit’s unique identifier (a hash), along with its date and message.
> git_history()
1 [facbb79] 2022-01-20: Add some more files and made some changes
2 [d8759cf] 2022-01-20: My first commit
We can look at what has changed between each of these commits using git_diff() on either a folder or a specific file. We can specify which commits we want to compare using filters like ‘before’ or ‘message’, or use NULL to get the version right now. See ?git_history for the full list.
Here we compare the second most recent commit (n=2) with the most recent (n=1).
> git_diff(path='.', n=2, n=1)
The default behaviour is to compare the current working directory with the most recent commit, showing what will be added (or lost if you git_undo()).
> write('uncommitted change', 'F', append=TRUE)
> git_diff(path='F')
An important feature of git is keeping parallel versions of the same working directory, called branches. These are useful if, for example, you want to develop some new code but keep a stable working version you can instantly change back to. We can use git_branch() to list the existing branches and change to one of these, or create a new branch which starts from the current working directory. A simplification is made here that forbids changing branch without having committed all of your changes, because uncommitted changes could get irreversibly lost.
It is good practice to create new branches for each feature or set of changes you are going to make, and only when it’s finished do you merge it back into the main branch. This is what we shall do here with our ‘TEST’ change.
> git_branch()
Current active branch: master
New or existing branch name to move to (or hit ENTER to list branches) test-branch
> for(change_files in LETTERS[2:6]) write('TEST', file=change_files, append=TRUE)
> git_add()
Modified files to be added (+ created, - deleted, * changed since added, ? conflict)
1 B
2 C
3 D
4 E
5 F
Which file numbers to add? (Hit ENTER to add all non-conflicting, else ESCAPE)
> git_commit()
Using global (system default) identity: myname <myname@email>
Commit message: Experimental changes on test-branch
Commit to test-branch? (hit ESCAPE to cancel)
Done
> git_branch()
Current active branch: test-branch
New or existing branch name to move to (or hit ENTER to list branches)
[facbb7] (Local) master
[0f39df] (Local) (HEAD) test-branch
We can merge these changes back into our master branch, or make another sub-branch off this one. A common problem when making changes on lots of branches is that you might change the same file in different ways. This causes a conflict, which git will try to resolve automatically; if it cannot, it will highlight the conflicting chunk with <<<<<<< and >>>>>>> markers and prompt you to fix it by hand. You will not be allowed to git_commit() before you have confirmed that you have fixed the conflicts and added the files by number. When a merge is complete, all of the commits from both branches will make up the shared history.
Return to the main branch and write a change to file E, remembering that we have already made a conflicting change on test-branch and Git will not know which is correct.
> git_branch()
Current active branch: test-branch
New or existing branch name to move to (or hit ENTER to list branches) master
> write('This change will conflict with our test-branch', file='E')
> write('This change does not conflict with test-branch', file='H')
> git_add()
Modified files to be added (+ created, - deleted, * changed since added, ? conflict)
1 E
2 + H
Which file numbers to add? (Hit ENTER to add all non-conflicting, else ESCAPE)
Adding 2 file(s)
> git_commit()
Using global (system default) identity: myname <myname@email>
Commit message: Added a change to E and H which are not in test-branch
Commit to master? (hit ESCAPE to cancel)
Done
We can see the difference between the current state of the two branches; these are the files which will need to be merged.
> git_diff('', branch='test-branch', branch='master')
> git_merge()
Branch to merge into master: test-branch
Merge test-branch into master
Proceed? (Y/N) Y
Delete this branch after successful merge? (Y/N) Y
Merging test-branch into master
This merge will result with conflicts to resolve manually. Continue anyway? (Y/N) TRUE
The following files have conflicts:
E
Deleting branch test-branch
Open the conflicting files and resolve <<<???>>> now? (Y/N) Y
This then opens the file in an editor (RStudio by default, otherwise an external editor) so that the conflicting parts can be deleted. You should do this straight away because the files will appear corrupted with all of the <<<<<<<, ======= and >>>>>>> markers in them.
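After editing, a quick way to double-check that no markers were left behind is to scan the file for them. This is a minimal base R sketch; `find_conflict_markers()` is a hypothetical helper for illustration (not a git4r function), and it assumes the standard seven-character markers at the start of a line.

```r
# Hypothetical helper (not part of git4r): return the line numbers in `file`
# that start with a Git conflict marker.
# Note: a line of '=======' used as, e.g., a heading underline also matches.
find_conflict_markers <- function(file) {
  lines <- readLines(file, warn = FALSE)
  which(grepl("^(<<<<<<<|=======|>>>>>>>)", lines))
}

# Example on a throwaway file that imitates an unresolved conflict
demo_file <- tempfile()
writeLines(c("a", "<<<<<<< HEAD", "mine", "=======", "theirs",
             ">>>>>>> test-branch", "b"), demo_file)
find_conflict_markers(demo_file)
```

An empty result suggests the file is clean; any line numbers returned point at markers that still need removing.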
To make sure you have fixed each file, they will not be added automatically when running git_add(); each file marked ? must be added by number.
> git_add()
Changes staged already
1 B
2 C
3 D
4 F
Any file numbers to un-add? (Hit ENTER to keep all)
Modified files to be added (+ created, - deleted, * changed since added, ? conflict)
1 ? E
Which file numbers to add? (Hit ENTER to add all non-conflicting, else ESCAPE) 1
Adding 1 file(s)
> git_commit()
Using global (system default) identity: myname <myname@email>
Commit message: Completed the merge of our test code
Commit to master? (hit ESCAPE to cancel)
Done
The history of the main branch now contains all commits from both branches, plus the commit we made immediately after completing the merge.
> git_history()
1 [a92c7d2] 2022-01-20: Completed the merge of our test code
2 [b9ea6f1] 2022-01-20: Added a change to E and H which are not in test-branch
3 [0f39df6] 2022-01-20: Experimental changes on test-branch
4 [facbb79] 2022-01-20: Add some more files and made some changes
5 [d8759cf] 2022-01-20: My first commit
Occasionally things will go wrong with our code and we will want to hard reset, or we will merge or delete a branch by mistake. This is totally undo-able as long as it is done as soon as possible.
git_undo() will show the recent history of actions that have happened locally (Git’s reflog) and allow you to restore the working directory to a previous commit checkpoint. It does not remember what branch you were on, so you may have to git_branch() to the branch with the right name before restoring the old files.
Typical use is to wipe any changes since the last commit by resetting to commit 0 on the list. It will duly warn of the irreversible loss (which can be checked with git_diff()).
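If any of those uncommitted changes might still be wanted, one precaution is to copy the working files somewhere safe before resetting. The sketch below is base R only; `backup_working_files()` is a hypothetical helper for illustration (not part of git4r), and it assumes a plain copy of everything except the `.git` folder is enough.

```r
# Hypothetical helper (not part of git4r): copy everything in `from` except
# the '.git' folder into a new temporary directory and return its path.
backup_working_files <- function(from = ".", to = tempfile("pre-undo-backup-")) {
  dir.create(to)
  files <- list.files(from, all.files = TRUE, no.. = TRUE)
  files <- setdiff(files, ".git")  # the repository history is not needed
  file.copy(file.path(from, files), to, recursive = TRUE)
  to
}

backup <- backup_working_files()
list.files(backup, all.files = TRUE, no.. = TRUE)  # what was saved
```

The backup lives in a temporary directory, so move anything you want to keep before ending the R session.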
> file.remove('B', 'D', 'G')
[1] TRUE TRUE TRUE
> git_undo()
Not all changes have been committed! Run git_diff() to see what.
>>>>> Continuing will result in IRREVERSIBLE LOSS <<<<<
0 [a92c7d2] HEAD@{0}: commit (merge): Completed the merge of our test code
1 [b9ea6f1] HEAD@{1}: commit: Added a change to E and H which are not in test-branch
2 [facbb79] HEAD@{2}: checkout: moving from test-branch to master
3 [0f39df6] HEAD@{3}: commit: Experimental changes on test-branch
4 [facbb79] HEAD@{4}: checkout: moving from master to test-branch
5 [facbb79] HEAD@{5}: commit: Add some more files and made some changes
6 [d8759cf] HEAD@{6}: commit (initial): My first commit
Which commit to reset to? (Hit ESCAPE to cancel) 0
Selected: [a92c7d2] HEAD@{0}: commit (merge): Completed the merge of our test code
Confirm that you really want to change the working directory to historic state? (Y/N) Y
Happy days.
> list.files()
[1] "A" "B" "C" "D" "E" "F" "G" "H"