BIRCH/Developer resources/Git Usage-based Workflow
From Bioinformatics.Org Wiki
Overview
Why use git?
A more detailed workflow summary with clarifications:
- BIRCH/Git for mere mortals: a Git quick-reference that summarizes all of what you should need to know.
This page aims to provide the bare minimum information to get started with Git.
Summary
In general, the workflow is:
1. Modify -> Make the proposed change.
2. Stage -> Tell Git what files you modified that you want in your commit. To ensure that ALL files in your project are included, go to the top directory for the project and type
git add .
This may take a while if there are many subdirectories and files, because Git has to identify which files are new, which have been changed, and which have been deleted.
3. Review -> Make sure you are committing only what you want to, and take this opportunity to clean up your changes.
git status
4. Commit -> Set a checkpoint that you can go back to. Once code is committed, it is stored in your local Git repo and can be recovered even if later changes go wrong.
git commit -a -m "quick description of the change"
5. Verify -> Verify your changes in the log before pushing them.
git log
6. Push -> Push the commit to the remote unstable branch. Once code is pushed, it can be viewed by everyone. If you have a solid change that you want to keep as a potential "roll back" point, type
git push origin master
If you are experimenting and want to start a new branch, type
git push origin unstable
7. Test -> Make sure that the changes don't break anything.
8. Promote -> Mark the new changes as stable, and integrate them into the golden "stable" branch.
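Steps 1 to 5 above can be tried out safely in a throwaway repository. The following is only a sketch: the scratch directory, the file name notes.txt, and the user identity are made up for illustration, and the push step is left out because it needs a remote.

```shell
set -eu
work=$(mktemp -d)                          # scratch directory (hypothetical)
cd "$work"
git init -q .
git symbolic-ref HEAD refs/heads/master    # pin the branch name regardless of local defaults
git config user.name "Example User"        # placeholder identity, needed to commit
git config user.email "user@example.com"

echo "first draft" > notes.txt             # 1. Modify
git add .                                  # 2. Stage
git status --short                         # 3. Review
git commit -q -a -m "add notes.txt"        # 4. Commit
git log --oneline                          # 5. Verify
```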
The workflow
Creating a repository
This is done with the command
git init .
It puts a .git folder in the current working directory, which is where Git keeps track of everything.
Note: This does not add any files; it just copies a Git skeleton into the current directory. It only needs to be done once in the lifetime of a repository.
After a repository has been created, you will be able to get it with the clone command:
git clone user@system:repo_name.git
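To see both commands in action without a server, a local path can stand in for user@system:repo_name.git. This is a sketch under that assumption; the directory names project and copy are hypothetical.

```shell
set -eu
work=$(mktemp -d)                          # scratch directory (hypothetical)
mkdir "$work/project"
cd "$work/project"
git init -q .                              # creates the .git folder
ls -d .git
git symbolic-ref HEAD refs/heads/master    # pin the branch name regardless of local defaults
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "hello" > README
git add README
git commit -q -m "initial commit"

cd "$work"
git clone -q "$work/project" copy          # a local path instead of user@system:repo_name.git
ls copy
```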
Making changes to a repository
Stage your changes
You need to tell Git which files you want to keep track of. To keep track of all files in a folder, do
git add .
Or else add them individually. This stages the changes (writes them on a virtual card, which will become the commit).
Note: Whenever a new file is created, you must tell Git to track it with
git add MY_NEW_FILE
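The effect of "git add" on a new file is easy to see with the short form of "git status". A small sketch in a scratch repository (the directory is hypothetical):

```shell
set -eu
work=$(mktemp -d)                   # scratch directory (hypothetical)
cd "$work"
git init -q .

echo "data" > MY_NEW_FILE
before=$(git status --short)        # untracked files are marked "??"
git add MY_NEW_FILE
after=$(git status --short)         # newly staged files are marked "A"
printf 'before: %s\nafter:  %s\n' "$before" "$after"
```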
Removing files
To remove a tracked file that has already been committed, use the command
git rm MY_FILE_TO_REMOVE
This will both remove it from the filesystem and stop tracking it within Git. Note: All old versions of the file will remain as they were within Git. They are preserved for the life of the repository, and can be "undeleted".
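One way to "undelete" a file is to check it out from the commit just before the removal. The sketch below assumes the removal was the most recent commit, so HEAD~1 still contains the file; the scratch directory and identity are placeholders.

```shell
set -eu
work=$(mktemp -d)                          # scratch directory (hypothetical)
cd "$work"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "keep me" > MY_FILE_TO_REMOVE
git add MY_FILE_TO_REMOVE
git commit -q -m "add file"

git rm -q MY_FILE_TO_REMOVE                # removes it from disk and from tracking
git commit -q -m "remove file"

# Bring back the version stored in the commit before the removal.
git checkout HEAD~1 -- MY_FILE_TO_REMOVE
cat MY_FILE_TO_REMOVE
```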
Review your changes
Before you commit your staged changes, you must review them.
There are two useful commands for this:
- Get a list of all changed files that are staged to be committed, as well as changed but not staged, and untracked:
git status
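- Get a list of all lines changed in the staged files to be committed:
git diff --cached
Note: "git diff" with no arguments diffs the working tree against the staging area, so it shows only changes that are not yet staged. "git diff --cached" diffs the staging area against the HEAD commit. "git diff" can also be used to compare any two commits, by specifying their SHAs or tags.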
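The difference between staged and unstaged changes shows up clearly when a file has some of each. In this sketch (scratch directory, file name, and identity are placeholders), the line "two" is staged and the line "three" is not:

```shell
set -eu
work=$(mktemp -d)                          # scratch directory (hypothetical)
cd "$work"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "first"

echo "two" >> file.txt
git add file.txt                           # "two" is now staged
echo "three" >> file.txt                   # "three" stays unstaged

unstaged=$(git diff)                       # working tree vs. staging area
staged=$(git diff --cached)                # staging area vs. HEAD
printf '%s\n---\n%s\n' "$unstaged" "$staged"
```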
Commit your changes
By default, Git will commit all staged changes. It will not, however, stage changes for you. For this reason, it is convenient to use the "-a" switch, which tells Git to stage all modified and deleted files that it already tracks, and commit them. New, untracked files still have to be added with "git add".
Each commit requires a commit message, which specifies a human readable summary of why the commit was made. These should be short and concise, something like "fixed bug #10", or "added feature X, untested, testing required". This is to make navigating the "history", or "commit log" more natural. This can be done at the command line using the "-m" switch, followed by a string, or else an editor will be launched.
To create a commit then:
git commit -a -m "My commit message"
This will create a commit object, which will be placed on the top of the history. The commit will be assigned a unique SHA-1 ID, which can be used to identify it. The current branch, and therefore HEAD, will now point to this ID.
If desired, a tag can be created to reference this commit more intelligibly.
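As a small illustration, the commit ID can be read back with "git rev-parse", and a lightweight tag can be pointed at it. The tag name feature-x-draft is invented for this sketch, as are the scratch directory and identity.

```shell
set -eu
work=$(mktemp -d)                          # scratch directory (hypothetical)
cd "$work"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "added feature X, untested, testing required"

sha=$(git rev-parse HEAD)                  # the ID of the commit just created
git tag feature-x-draft                    # hypothetical tag name, points at HEAD
git rev-parse feature-x-draft              # prints the same ID
```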
Verify your changes
Before pushing changes to a remote repository, it is a best practice to verify them. The "git log" command is very useful for this. By default, it will display only the commit message, commit author, commit time, and commit SHA for a commit.
By adding the "-p" option, you can see exactly what changed in a commit. The "p" stands for "patch", showing which lines were added, and which were removed.
There are a lot of different ways to view the history, but at minimum I suggest reviewing the patch before pushing it.
git log -p
git log VARIOUS OPTIONS
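A few of the more common views of the history, shown on a two-commit scratch repository. The file and directory names are placeholders; "--oneline" and "--stat" are just two examples of the various options.

```shell
set -eu
work=$(mktemp -d)                          # scratch directory (hypothetical)
cd "$work"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "first line" > notes.txt
git add notes.txt
git commit -q -m "start notes"
echo "second line" >> notes.txt
git commit -q -a -m "extend notes"

git log --oneline        # one line per commit: short SHA and message
git log --stat           # adds a per-file summary of what changed
git log -p -1            # full patch for the most recent commit only
```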
Push your branch to a remote repository
Note: You will need to submit your .ssh/id_rsa.pub (your public key) to the remote repo administrator to have push privileges.
By default you will be pushing your master branch to the master branch on a remote repository. Assuming that you have committed your change to your local unstable branch, you can then push it to the origin machine (the remote repository) with
git push origin unstable
If you are working on a different branch, you must pull, or rebase, or merge your committed changes to the unstable branch first.
If the branch terminology is throwing you off, just keep these basic concepts in mind:
- You are always developing on some branch on your local machine; which one you use doesn't matter.
- You can always move a commit between branches easily.
- Switching branches takes almost no time, and is part of a good workflow.
Before you push for a first time, consult someone with a better understanding of Git than yourself, as they will be able to verify that you are doing it correctly. This will prevent damaging the history on the remote repository.
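Pushing can be rehearsed locally before touching the real remote. In this sketch a bare repository in a scratch directory plays the role of origin; the paths, file name, and identity are all hypothetical.

```shell
set -eu
work=$(mktemp -d)                          # scratch directory (hypothetical)
git init -q --bare "$work/remote.git"      # stands in for the shared remote repository

git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master    # pin the branch name regardless of local defaults
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git remote add origin "$work/remote.git"

echo "base" > file.txt
git add file.txt
git commit -q -m "baseline"

git checkout -q -b unstable                # do the work on the unstable branch
echo "change" >> file.txt
git commit -q -a -m "work in progress"

git push -q origin unstable                # publish the local unstable branch
git ls-remote --heads origin               # list the branches the remote now has
```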
Undoing changes: using Git as a safety net
These are the basic use cases for when changes will need to be undone:
Rolling a staged file back to HEAD
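This is useful for undoing all your work on a file since the last commit, whether or not that work has been staged.
git checkout HEAD -- MYFILE
(Plain "git checkout MYFILE" restores the file from the staging area instead, so edits that were already staged would survive.)
In general, this is a clumsy but simple technique, and the discarded work cannot be recovered. It is much better to create a dummy commit first, so that you can undo your roll-back.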
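A sketch of rolling a single file back to its last committed state, using the explicit form "git checkout HEAD -- MYFILE" so that staged edits are discarded too. The scratch directory and identity are placeholders.

```shell
set -eu
work=$(mktemp -d)                          # scratch directory (hypothetical)
cd "$work"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "committed text" > MYFILE
git add MYFILE
git commit -q -m "baseline"

echo "experiment" >> MYFILE
git add MYFILE                             # the experiment is even staged

git checkout HEAD -- MYFILE                # restore index and working copy from HEAD
git status --short                         # prints nothing: the file matches HEAD again
cat MYFILE
```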
Unstaging your changes (safely)
This can be done to completely unstage all changes. It does not affect the files themselves; the working tree remains intact.
git reset HEAD
Unstaging your changes (dangerous)
This will unstage all of your changes, and revert back to a particular commit. In this case, HEAD is used. Do not take this lightly: it will throw away all uncommitted changes to the working tree.
git reset --hard HEAD
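The difference between the safe and the dangerous reset is easiest to see side by side. In this sketch (file name, scratch directory, and identity are placeholders) the same edit survives "git reset HEAD" but is lost after "git reset --hard HEAD":

```shell
set -eu
work=$(mktemp -d)                          # scratch directory (hypothetical)
cd "$work"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "baseline"

echo "edit" >> file.txt
git add file.txt
git reset -q HEAD                          # safe: unstages, file on disk is untouched
after_safe=$(cat file.txt)
git status --short                         # file.txt is listed as modified but not staged

git add file.txt
git reset -q --hard HEAD                   # dangerous: the edit is thrown away
after_hard=$(cat file.txt)
printf 'after safe reset:\n%s\nafter hard reset:\n%s\n' "$after_safe" "$after_hard"
```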
Promoting changes from unstable to stable branches
Once a change has been verified to be safe, the commit can be cherry-picked from the unstable branch to the stable branch. This is basically just saying "yes, these changes are tested and verified to be safe, let's mark them as such".
In general, it is best practice to get a second set of eyes to verify your code, to ensure that it is stable. It is often most practical though to simply test and verify the changes yourself, and be held accountable for any broken changes that are promoted to the stable branch.
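A minimal sketch of promoting one commit by cherry-picking, assuming local branches named stable and unstable that start from the same commit. The commit message, file names, and identity are invented for illustration.

```shell
set -eu
work=$(mktemp -d)                          # scratch directory (hypothetical)
cd "$work"
git init -q .
git symbolic-ref HEAD refs/heads/master    # pin the branch name regardless of local defaults
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "baseline"
git branch stable                          # the golden branch starts at the baseline

git checkout -q -b unstable
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "fixed bug #10"
fix_sha=$(git rev-parse HEAD)              # the commit that was tested and found safe

git checkout -q stable
git cherry-pick "$fix_sha" > /dev/null     # copy just that commit onto stable
git log --oneline stable
```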
Working with submodules
Submodules are simply references to other repositories. A specific version of a project consisting of submodules will have refs to the head of each of its submodules. Essentially, this means that the super-project "birchdev" will eventually just be a set of empty directories, each referencing a repository.
For instance, "biolegato" is now a submodule. When you clone "birchdev" you will see that "java/biolegato" is simply an empty directory. If you do
cd java/biolegato && git submodule init && git submodule update
This populates the folder with the commit that the superproject records for that submodule.
For more information, read over http://progit.org/book/ch6-6.html
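The empty-directory behaviour can be reproduced locally. This sketch builds stand-ins for "biolegato" and "birchdev" in a scratch directory; their contents are made up. It runs the submodule commands from the top of the clone rather than from inside java/biolegato, and it passes "-c protocol.file.allow=always" on the assumption that a recent Git would otherwise refuse to clone a submodule from a local path. That option is not needed for a network remote.

```shell
set -eu
work=$(mktemp -d)                                   # scratch directory (hypothetical)

git init -q "$work/biolegato"                       # stand-in for the biolegato repository
cd "$work/biolegato"
git config user.name "Example User"                 # placeholder identity
git config user.email "user@example.com"
echo "biolegato" > README
git add README
git commit -q -m "initial commit"

git init -q "$work/birchdev"                        # stand-in for the superproject
cd "$work/birchdev"
git config user.name "Example User"
git config user.email "user@example.com"
echo "birchdev" > README
git add README
git commit -q -m "initial commit"
git -c protocol.file.allow=always submodule --quiet add "$work/biolegato" java/biolegato
git commit -q -m "add biolegato submodule"

git clone -q "$work/birchdev" "$work/clone"
cd "$work/clone"
before=$(ls -A java/biolegato)                      # empty: only a reference is checked out
git submodule --quiet init
git -c protocol.file.allow=always submodule --quiet update
ls java/biolegato                                   # now contains README
```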
Development workflow with Branches
Branches are like walls that act as a defense against bad code. By committing all development code to an "unstable" branch, everyone can contribute safely to the master branch by only promoting their stable code to it.
Getting your head around using branches can be tough, so I recommend you read over [1].
Basically, a branch can be created instantly because it is just like a tag - it is a pointer to a specific commit. Branching allows you to easily create a sandbox, like "awesomefix" seen below.
You can get a list of existing branches for a repository with
git branch
To view your branches graphically, use
gitk
Essentially, it works like this:
- You get a new feature/bugfix that you want to code
- You checkout the unstable branch with
git checkout unstable # if the unstable branch doesn't exist, git checkout -b unstable will create it
- You are now at state #2 below. You checkout a new branch for your feature, let's call it awesomefix
git checkout -b awesomefix
- You are now at state #3 below. You code up your awesome fix, and test it
- You decide your most recent commit works.
- You rebase your fix onto the unstable branch (if unstable has not moved in the meantime, this simply fast-forwards it to awesomefix) with
git checkout unstable && git rebase awesomefix
- You are now at state #4 below. Your awesomefix is now on the unstable branch.
- You test out the unstable branch, and decide the latest commit is stable
- You rebase your fix onto master
git checkout master && git rebase unstable
- You are now at state #5 below, and the heads of master, unstable, and awesomefix are all pointing to the same commit ref (as below).
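The whole sequence above can be rehearsed in a scratch repository. This is a sketch that assumes nobody else commits to unstable or master in the meantime, so both rebases are plain fast-forwards; the file names and identity are placeholders.

```shell
set -eu
work=$(mktemp -d)                          # scratch directory (hypothetical)
cd "$work"
git init -q .
git symbolic-ref HEAD refs/heads/master    # pin the branch name regardless of local defaults
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "baseline"

git checkout -q -b unstable                # create unstable from master
git checkout -q -b awesomefix              # sandbox for the new fix
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "awesome fix"

git checkout -q unstable
git rebase -q awesomefix                   # unstable catches up with awesomefix
git checkout -q master
git rebase -q unstable                     # master catches up with unstable

git branch -v                              # all three branches show the same commit
```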