BIRCH/Git for mere mortals

From Bioinformatics.Org Wiki


Please glance over this page on what git is not



Developer Workflow

The bare minimum

When you make a new file, add it with:

git add PATH_TO_FILE_OR_FOLDER # the path does not need to be absolute; it is relative to the current working directory, as long as that directory is inside a git repository. Folders are added recursively.

Git does not track empty folders, so when you make a new folder that is intended to stay empty, you must put a placeholder file in it:

mkdir myemptyfolder
cd myemptyfolder
echo "">.gitignore
git add .gitignore
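To see why the placeholder is needed, here is a minimal sketch in a throwaway repository. The scratch location from mktemp and the variable names are assumptions for illustration only.

```shell
set -e
# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir myemptyfolder
# An empty folder is invisible to git: status reports nothing.
before=$(git status --porcelain)

# A placeholder file makes the folder trackable.
touch myemptyfolder/.gitignore
git add myemptyfolder/.gitignore
after=$(git status --porcelain)

echo "before: [$before]"
echo "after:  [$after]"
```

The first status is empty, while the second lists the placeholder file as added.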

When you are done modifying files:

git commit -a -m "Summary of what I did"

That is it! You are working with git!
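Put together, the bare minimum might look like the sketch below. It runs in a scratch repository; the file name, commit messages, and the name and email are placeholders, not anything BIRCH requires.

```shell
set -e
# Scratch repository; the identity below applies to this repository only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "first draft" > notes.txt
git add notes.txt                    # a new file must be added once
git commit -q -m "Add notes"

echo "second draft" >> notes.txt
git commit -q -a -m "Expand notes"   # -a picks up changes to files git already tracks

git log --oneline                    # two commits, newest first
```

Note that -a only covers files git already knows about, so a brand new file still needs git add first.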


More advanced workflow

Branching

It makes sense to use branches as often as possible. I suggest you read the beginner section of gitready before you attempt this, as you might otherwise get confused. Also, take a look at progit's section on branches.

Suppose you are working on a feature called "awesomeidea". Before you start your work, create a branch and switch to it with:

git checkout -b awesomeidea

Note that uncommitted changes on your current branch come along to the new branch. If you do not want that, commit them or stash them first.
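If you would rather set unfinished edits aside before branching, git stash is one way to do it. A small sketch in a scratch repository (file name and identity are placeholders):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "stable" > code.txt
git add code.txt
git commit -q -m "Initial commit"

echo "half-finished edit" >> code.txt   # an uncommitted change
git stash -q                            # set it aside; the working tree is clean again
clean=$(git status --porcelain)

git checkout -q -b awesomeidea          # start the branch from a clean state
git stash pop -q                        # bring the edit back when you want it
git status --short
```

The stash is not tied to a branch, so it can be popped on whichever branch you happen to be on.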

Do your work on awesomeidea, test your changes, then commit them to your current branch. Then, switch back to your previous branch, and merge them:

git branch #lists the branches in the current repository, choose your original branch. Suppose it is called "master"
git checkout master
git merge awesomeidea

That's it! You're done! No headaches, and you kept awesomeidea separate so that if it turned out to be too hard, or you needed to leave it unfinished, your changes would all be in one place. One final housekeeping step is to delete the branch:

git branch -d awesomeidea # lowercase -d refuses to delete a branch that has not been merged; -D forces the deletion
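The whole branch, merge, and clean-up cycle can be tried safely in a scratch repository. In this sketch the file name and identity are placeholders, and the first branch is explicitly named "master" so the commands match the text above.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called "master"
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

git checkout -q -b awesomeidea            # create the branch and switch to it
echo "awesome feature" >> app.txt
git commit -q -a -m "Implement awesome idea"

git checkout -q master
git merge -q awesomeidea                  # a fast-forward, since master did not move
git branch -d awesomeidea                 # safe delete: only works once merged
git log --oneline
```

This sketch uses the lowercase -d, which only succeeds when the branch has already been merged.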

Pushing

Before you can push, your branch must be in synch with the branch you want to push to. This is done with a "pull":

git pull --rebase upstream master

This command will:

  1. fetch changes from the repository "upstream" for branch "master"
  2. "rewind" your changes to the last time the two "master" branches were in synch, "fast-forward" to the changes from upstream master, then re-apply your changes on top.

Once you are in synch, you can verify it:

git fetch upstream master && git log upstream/master 
git log master #you should see your changes applied on top of those from upstream master.

The top commit of upstream/master should appear in your master log, directly beneath your own commits. You are now ready to push.

git push upstream master

This pushes your changes to the upstream repository's master branch. You can verify that the push was successful with:

git fetch upstream master && git log upstream/master 
git log master 

Both logs should have the same commit at the top.
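The pull, rebase, and push sequence can be rehearsed locally before trying it on a real server. The sketch below is only an illustration under stated assumptions: a bare repository on disk stands in for the ssh "upstream", and "alice" and "bob" are hypothetical developers.

```shell
set -e
scratch=$(mktemp -d)
cd "$scratch"

# A bare repository plays the role of the shared "upstream" server.
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master

# Alice creates the project and publishes the first commit.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
git remote add upstream "$scratch/upstream.git"
echo "base" > README
git add README
git commit -q -m "Base"
git push -q upstream master

# Bob clones, commits, and pushes before Alice does.
cd "$scratch"
git clone -q -o upstream upstream.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"
echo "bob" > bob.txt
git add bob.txt
git commit -q -m "Bob's change"
git push -q upstream master

# Alice has local work that upstream does not know about yet.
cd "$scratch/alice"
echo "alice" > alice.txt
git add alice.txt
git commit -q -m "Alice's change"

git pull -q --rebase upstream master   # replay Alice's commit on top of Bob's
git push -q upstream master

git fetch -q upstream master
git log --oneline master               # Alice's commit on top, then Bob's, then the base
git rev-parse master upstream/master   # both names resolve to the same commit after the push
```

Without the pull, Alice's push would be rejected, because upstream already contains a commit that she does not have.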


Troubleshooting

Pushes are done over ssh: the repository that you want to commit to must have your public key or else it will assume you are malevolent. If you get any error messages, email them to me with your public key.

Command references

Add to this as you go.

git help TOPIC is really useful, use it!

To combine (squash) the last X commits into one, start an interactive rebase and mark the commits you want folded in as "squash":

git rebase -i HEAD~X
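Normally the interactive rebase opens your editor with a list of commits. To show the effect without an editor, the sketch below scripts that step: a small shell function (an assumption for this demo, passed through the GIT_SEQUENCE_EDITOR environment variable) changes every "pick" after the first to "squash", and GIT_EDITOR=true accepts the combined commit message as is.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Four small commits to play with.
for n in 1 2 3 4; do
    echo "line $n" >> log.txt
    git add log.txt
    git commit -q -m "Commit $n"
done

# Stand-in for the editor: rewrite the todo list so that the last three
# commits are folded into a single one.
GIT_SEQUENCE_EDITOR='f() { sed "2,\$s/^pick/squash/" "$1" > "$1.new" && mv "$1.new" "$1"; }; f' \
GIT_EDITOR=true \
git rebase -i HEAD~3

git log --oneline   # "Commit 1" plus one combined commit
```

The file contents are unchanged; only the history is shorter. Avoid doing this to commits that have already been pushed.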


Git commits, branches, and merges: Just a stack of cards

A simple way to visualize how git works is to represent each commit as a card. When you make a commit, all of the changes that you made are stored in that card. By default, this card is added to the top of the stack of other commits (cards) that have been made on the repository.

Let's call the repository stack the "master" stack.

When you branch, you make another stack and start piling your cards on there. Others can still pile cards on the master stack, though, and you can work on as many branch stacks as you want. Whenever you want, you can simply grab a branch and stack it back on top of the master stack. This is how a merge works.

Git treats all branches as equal, meaning you can merge any branches you want. Since commits are atomic (think cards), anything sensible that you can do with cards you can do with commits. For this reason, it makes sense to commit as often as possible, keeping your commits logically partitioned, so that you can shuffle your changes around however you please.


Underlying implementation and definitions

Git uses an index to keep track of staged files. When you have staged all of your changes, you commit them to the object store.
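The path from a modified file to the index to a commit can be watched with git status. A minimal sketch in a scratch repository (file name and identity are placeholders); the --porcelain flag prints a short two-column status code, where the first column describes the index and the second the working directory.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "v1" > data.txt
git add data.txt
git commit -q -m "Add data"

echo "version two" > data.txt
modified=$(git status --porcelain)    # " M data.txt": changed, not yet staged
git add data.txt
staged=$(git status --porcelain)      # "M  data.txt": staged in the index
git commit -q -m "Update data"
committed=$(git status --porcelain)   # empty: nothing left to commit

printf '%s\n' "modified:  [$modified]" "staged:    [$staged]" "committed: [$committed]"
```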

staged files: These are files under version control whose changes have been marked for the next commit with "git add" (or automatically by "git commit -a"). You can view staged files by executing "git status"; the output is self-explanatory.

index: Where objects (blobs) go when they are staged. When a commit occurs, these changes enter the object store.

object store: All committed changes and the git history are kept in here. Once something is in the object store, it is very hard to lose that data, short of destroying the object store itself before it is pushed to an upstream repository.

commits: these show up in the git log. A commit is an incremental set of changes pushed into the object store. Each commit is uniquely identifiable by a SHA1 checksum. The current commit is referred to as "HEAD". The previous commit can be easily obtained with "HEAD^", and the one before that with "HEAD^^". A different notation to get the commit X back from HEAD is "HEAD~X" (instead of having a huge number of ^'s). The git log provides a visual representation of your commit history.

pushes: A push sends all commits that are on your copy up to a repository. It is important to synch and rebase before a push. Pushing is more complicated than committing, so don't worry about it if you don't understand it yet.

branches: On any given git repository, you can have as many branches as you like. Branches only differ by which commit they use as their head reference. Branches are very easy to merge: the only time that things get troublesome is when the same part of the same file has been modified on both branches.
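The HEAD^ and HEAD~X notations from the commits entry above are easy to try out. This sketch builds four numbered commits in a scratch repository (names are placeholders) and prints the message of each one by its relative name:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

for n in 1 2 3 4; do
    echo "$n" >> file.txt
    git add file.txt
    git commit -q -m "Commit $n"
done

git log -1 --format=%s HEAD      # Commit 4
git log -1 --format=%s HEAD^     # Commit 3
git log -1 --format=%s HEAD^^    # Commit 2
git log -1 --format=%s HEAD~3    # Commit 1
```

In a history like this one, with no merges, HEAD^^ and HEAD~2 name the same commit.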


Warnings

If you are working with Git on Windows (please don't...), make SURE that you do NOT allow ANY files to be saved with DOS (CRLF) line endings. Doing so tricks the repository into thinking that every line of the file has been modified! This breaks features like git diffs and git patches.
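One possible safeguard, offered here as a suggestion rather than a BIRCH policy, is git's core.autocrlf setting. With the value "input", git converts CRLF to LF whenever a text file is added. The sketch below checks this in a scratch repository with a deliberately DOS-formatted file (all names are placeholders):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
git config core.autocrlf input      # convert CRLF to LF whenever files are added

printf 'one\r\ntwo\r\n' > dos.txt   # a file with DOS line endings
git add dos.txt                     # git may warn that CRLF will be replaced by LF
git commit -q -m "Add file"

# Count carriage returns in the committed copy versus the working copy.
stored=$(git show HEAD:dos.txt | tr -cd '\r' | wc -c | tr -d ' ')
working=$(tr -cd '\r' < dos.txt | wc -c | tr -d ' ')
echo "carriage returns stored: $stored, in working file: $working"
```

The committed copy contains no carriage returns even though the file on disk still does, so diffs against the repository stay clean.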

Commit as often as possible: Git prevents you from losing your code, but it only guarantees it if you commit it. Commits are relatively fast and are your friend! Similarly, use branches for unstable code/features that aren't yet working to keep the master branch clean.
