-With that, you should now have some inkling of what git does, and
-can explore on your own.
-
-
-[ Side note: most likely, you are not directly using the core
- git Plumbing commands, but using Porcelain like Cogito on top
- of it. Cogito works a bit differently and you usually do not
- have to run "git-update-cache" yourself for changed files (you
- do tell underlying git about additions and removals via
- "cg-add" and "cg-rm" commands). Just before you make a commit
- with "cg-commit", Cogito figures out which files you modified,
- and runs "git-update-cache" on them for you. ]
-
-
- Tagging a version
- -----------------
-
-In git, there are two kinds of tags: a "light" one and a "signed tag".
-
-A "light" tag is technically nothing more than a branch, except we put
-it in the ".git/refs/tags/" subdirectory instead of calling it a "head".
-So the simplest form of tag involves nothing more than
-
- cat .git/HEAD > .git/refs/tags/my-first-tag
-
-after which point you can use this symbolic name for that particular
-state. You can, for example, do
-
- git diff my-first-tag
-
-to diff your current state against that tag (which at this point will
-obviously be an empty diff, but if you continue to develop and commit
-stuff, you can use your tag as an "anchor-point" to see what has changed
-since you tagged it).
-
-A "signed tag" is actually a real git object, and contains not only a
-pointer to the state you want to tag, but also a small tag name and
-message, along with a PGP signature that says that yes, you really did
-that tag. You create these signed tags with
-
- git tag <tagname>
-
-which will sign the current HEAD (but you can also give it another
-argument that specifies the thing to tag, i.e. you could have tagged the
-current "mybranch" point by using "git tag <tagname> mybranch").
-
-You normally only do signed tags for major releases or things
-like that, while the light-weight tags are useful for any marking you
-want to do - any time you decide that you want to remember a certain
-point, just create a private tag for it, and you have a nice symbolic
-name for the state at that point.
-
-
- Copying archives
- ----------------
-
-Git archives are normally totally self-sufficient, and it's worth noting
-that unlike CVS, for example, there is no separate notion of
-"repository" and "working tree". A git repository normally _is_ the
-working tree, with the local git information hidden in the ".git"
-subdirectory. There is nothing else. What you see is what you got.
-
-[ Side note: you can tell git to split the git internal information from
- the directory that it tracks, but we'll ignore that for now: it's not
- how normal projects work, and it's really only meant for special uses.
- So the mental model of "the git information is always tied directly to
- the working directory that it describes" may not be technically 100%
- accurate, but it's a good model for all normal use ]
-
-This has two implications:
-
- - if you grow bored with the tutorial archive you created (or you've
- made a mistake and want to start all over), you can just do a simple
-
- rm -rf git-tutorial
-
- and it will be gone. There's no external repository, and there's no
- history outside of the project you created.
-
- - if you want to move or duplicate a git archive, you can do so. There
- is a "git clone" command, but if all you want to do is just to
- create a copy of your archive (with all the full history that
- went along with it), you can do so with a regular
- "cp -a git-tutorial new-git-tutorial".
-
- Note that when you've moved or copied a git archive, your git index
- file (which caches various information, notably some of the "stat"
- information for the files involved) will likely need to be refreshed.
- So after you do a "cp -a" to create a new copy, you'll want to do
-
- git-update-cache --refresh
-
- to make sure that the index file is up-to-date in the new one.
-
-Note that the second point is true even across machines. You can
-duplicate a remote git archive with _any_ regular copy mechanism, be it
-"scp", "rsync" or "wget".
-
-When copying a remote repository, you'll want at a minimum to update the
-index cache afterward, and especially with other people's
-repositories you often want to make sure that the index cache is in some
-known state (you don't know _what_ they've done and not yet checked in),
-so usually you'll precede the "git-update-cache" with a
-
- git-read-tree --reset HEAD
- git-update-cache --refresh
-
-which will force a total index re-build from the tree pointed to by HEAD
-(it resets the index contents to HEAD, and then the git-update-cache
-makes sure to match up all index entries with the checked-out files).
-
-The above can also be written as simply
-
- git reset
-
-and in fact a lot of the common git command combinations can be scripted
-with the "git xyz" interfaces, and you can learn things by just looking
-at what the git-*-script scripts do ("git reset" is the above two lines
-implemented in "git-reset-script", but some things like "git status" and
-"git commit" are slightly more complex scripts around the basic git
-commands).
-
-NOTE! Many (most?) public remote repositories will not contain any of
-the checked out files or even an index file, and will _only_ contain the
-actual core git files. Such a repository usually doesn't even have the
-".git" subdirectory, but has all the git files directly in the
-repository.
-
-To create your own local live copy of such a "raw" git repository, you'd
-first create your own subdirectory for the project, and then copy the
-raw repository contents into the ".git" directory. For example, to
-create your own copy of the git repository, you'd do the following
-
- mkdir my-git
- cd my-git
- rsync -rL rsync://rsync.kernel.org/pub/scm/git/git.git/ .git
-
-followed by
-
- git-read-tree HEAD
-
-to populate the index. However, now you have populated the index, and
-you have all the git internal files, but you will notice that you don't
-actually have any of the _working_directory_ files to work on. To get
-those, you'd check them out with
-
- git-checkout-cache -u -a
-
-where the "-u" flag means that you want the checkout to keep the index
-up-to-date (so that you don't have to refresh it afterward), and the
-"-a" flag means "check out all files" (if you have a stale copy or an
-older version of a checked out tree you may also need to add the "-f"
-flag first, to tell git-checkout-cache to _force_ overwriting of any old
-files).
-
-Again, this can all be simplified with
-
- git clone rsync://rsync.kernel.org/pub/scm/git/git.git/ my-git
- cd my-git
- git checkout
-
-which will end up doing all of the above for you.
-
-You have now successfully copied somebody else's (mine) remote
-repository, and checked it out.
-
-
- Creating a new branch
- ---------------------
-
-Branches in git are really nothing more than pointers into the git
-object space from within the ".git/refs/" subdirectory, and as we
-already discussed, the HEAD branch is nothing but a symlink to one of
-these object pointers.
-
-You can at any time create a new branch by just picking an arbitrary
-point in the project history, and just writing the SHA1 name of that
-object into a file under .git/refs/heads/. You can use any filename you
-want (and indeed, subdirectories), but the convention is that the
-"normal" branch is called "master". That's just a convention, though,
-and nothing enforces it.
-
-To show that as an example, let's go back to the git-tutorial archive we
-used earlier, and create a branch in it. You literally do that by just
-creating a new SHA1 reference file, and switch to it by just making the
-HEAD pointer point to it:
-
- cat .git/HEAD > .git/refs/heads/mybranch
- ln -sf refs/heads/mybranch .git/HEAD
-
-and you're done.
-
-Now, if you make the decision to start your new branch at some other
-point in the history than the current HEAD, you usually also want to
-actually switch the contents of your working directory to that point
-when you switch the head, and "git checkout" will do that for you:
-instead of switching the branch by hand with "ln -sf", you can just do
-
- git checkout mybranch
-
-which will basically "jump" to the branch specified, update your working
-directory to that state, and also make it become the new default HEAD.
-
-You can always just jump back to your original "master" branch by doing
-
- git checkout master
-
-and if you forget which branch you happen to be on, a simple
-
- ls -l .git/HEAD
-
-will tell you where it's pointing.
-
-
- Merging two branches
- --------------------
-
-One of the ideas of having a branch is that you do some (possibly
-experimental) work in it, and eventually merge it back to the main
-branch. So assuming you created the above "mybranch" that started out
-being the same as the original "master" branch, let's make sure we're in
-that branch, and do some work there.
-
- git checkout mybranch
- echo "Work, work, work" >> a
- git commit a
-
-Here, we just added another line to "a", and we used a shorthand for
-both doing a "git-update-cache a" and "git commit" by just giving the
-filename directly to "git commit".
-
-Now, to make it a bit more interesting, let's assume that somebody else
-does some work in the original branch, and simulate that by going back
-to the master branch, and editing the same file differently there:
-
- git checkout master
-
-Here, take a moment to look at the contents of "a", and notice how they
-don't contain the work we just did in "mybranch" - because that work
-hasn't happened in the "master" branch at all. Then do
-
- echo "Play, play, play" >> a
- echo "Lots of fun" >> b
- git commit a b
-
-since the master branch is obviously in a much better mood.
-
-Now, you've got two branches, and you decide that you want to merge the
-work done. Before we do that, let's introduce a cool graphical tool that
-helps you view what's going on:
-
- gitk --all
-
-will show you graphically both of your branches (that's what the "--all"
-means: normally it will just show you your current HEAD) and their
-histories. You can also see exactly how they came to be from a common
-source.
-
-Anyway, let's exit gitk (^Q or the File menu), and decide that we want
-to merge the work we did on the "mybranch" branch into the "master"
-branch (which is currently our HEAD too). To do that, there's a nice
-script called "git resolve", which wants to know which branches you want
-to resolve and what the merge is all about:
-
- git resolve HEAD mybranch "Merge work in mybranch"
-
-where the third argument is going to be used as the commit message if
-the merge can be resolved automatically.
-
-Now, in this case we've intentionally created a situation where the
-merge will need to be fixed up by hand, though, so git will do as much
-of it as it can automatically (which in this case is just merge the "b"
-file, which had no differences in the "mybranch" branch), and say:
-
- Simple merge failed, trying Automatic merge
- Auto-merging a.
- merge: warning: conflicts during merge
- ERROR: Merge conflict in a.
- fatal: merge program failed
- Automatic merge failed, fix up by hand
-
-which is way too verbose, but it basically tells you that it failed the
-really trivial merge ("Simple merge") and did an "Automatic merge"
-instead, but that too failed due to conflicts in "a".
-
-Not to worry. It left the (trivial) conflict in "a" in the same form you
-should already be well used to if you've ever used CVS, so let's just
-open "a" in our editor (whatever that may be), and fix it up somehow.
-I'd suggest just making it so that "a" contains all four lines:
-
- Hello World
- It's a new day for git
- Play, play, play
- Work, work, work
-
-and once you're happy with your manual merge, just do a
-
- git commit a
-
-which will very loudly warn you that you're now committing a merge
-(which is correct, so never mind), and you can write a small merge
-message about your adventures in git-merge-land.
-
-After you're done, start up "gitk --all" to see graphically what the
-history looks like. Notice that "mybranch" still exists, and you can
-switch to it, and continue to work with it if you want to. The
-"mybranch" branch will not contain the merge, but next time you merge it
-from the "master" branch, git will know how you merged it, so you'll not
-have to do _that_ merge again.
-
-
- Merging external work
- ---------------------
-
-It's usually much more common to merge with somebody else than to
-merge with your own branches, so it's worth pointing out that git
-makes that very easy too, and in fact, it's not that different from
-doing a "git resolve". In fact, a remote merge ends up being nothing
-more than "fetch the work from a remote repository into a temporary tag"
-followed by a "git resolve".
-
-It's such a common thing to do that it's called "git pull", and you can
-simply do
-
- git pull <remote-repository>
-
-and optionally give a branch-name for the remote end as a second
-argument.
-
-The "remote" repository can even be on the same machine. One of
-the following notations can be used to name the repository to
-pull from:
-
- Rsync URL
- rsync://remote.machine/path/to/repo.git/
-
- HTTP(s) URL
- http://remote.machine/path/to/repo.git/
-
- GIT URL
- git://remote.machine/path/to/repo.git/
- remote.machine:/path/to/repo.git/
-
- Local directory
- /path/to/repo.git/
-
-[ Side Note: currently, HTTP transport is slightly broken in
- that when the remote repository is "packed" it does not always
- work. But we have not talked about packing repositories yet, so
- let's not worry too much about it for now. ]
-
-[ Digression: you could do without using any branches at all, by
- keeping as many local repositories as you would like to have
- branches, and merging between them with "git pull", just like
- you merge between branches. The advantage of this approach is
- that it lets you keep a set of files for each "branch" checked
- out and you may find it easier to switch back and forth if you
- juggle multiple lines of development simultaneously. Of
- course, you will pay the price of more disk usage to hold
- multiple working trees, but disk space is cheap these days. ]
-
-It is likely that you will be pulling from the same remote
-repository from time to time. As a shorthand, you can store
-the remote repository URL in a file under the .git/branches/
-directory, like this:
-
- mkdir -p .git/branches
- echo rsync://kernel.org/pub/scm/git/git.git/ \
- >.git/branches/linus
-
-and use the filename to "git pull" instead of the full URL.
-The contents of a file under .git/branches can even be a prefix
-of a full URL, like this:
-
- echo rsync://kernel.org/pub/.../jgarzik/ \
- >.git/branches/jgarzik
-
-Examples.
-
- (1) git pull linus
- (2) git pull linus tag v0.99.1
- (3) git pull jgarzik/netdev-2.6.git/ e100
-
-the above are equivalent to:
-
- (1) git pull rsync://kernel.org/pub/scm/git/git.git/ HEAD
- (2) git pull rsync://kernel.org/pub/scm/git/git.git/ tag v0.99.1
- (3) git pull rsync://kernel.org/pub/.../jgarzik/netdev-2.6.git e100
-
-
- Publishing your work
- --------------------
-
-So we can use somebody else's work from a remote repository; but
-how can _you_ prepare a repository to let other people pull from
-it?
-
-You do your real work in your working directory that has your
-primary repository hanging under it as its ".git" subdirectory.
-You _could_ make that repository accessible remotely and ask
-people to pull from it, but in practice that is not the way
-things are usually done. A recommended way is to have a public
-repository, make it reachable by other people, and when the
-changes you made in your primary working directory are in good
-shape, update the public repository from it. This is often
-called "pushing".
-
-[ Side note: this public repository could further be mirrored,
- and that is how kernel.org git repositories are done. ]
-
-Publishing the changes from your local (private) repository to
-your remote (public) repository requires a write privilege on
-the remote machine. You need to have an SSH account there to
-run a single command, "git-receive-pack".
-
-First, you need to create an empty repository on the remote
-machine that will house your public repository. This empty
-repository will be populated and be kept up-to-date by pushing
-into it later. Obviously, this repository creation needs to be
-done only once.
-
-[ Digression: "git push" uses a pair of programs,
- "git-send-pack" on your local machine, and "git-receive-pack"
- on the remote machine. The communication between the two over
- the network internally uses an SSH connection. ]
-
-Your private repository's GIT directory is usually .git, but
-your public repository is often named after the project name,
-i.e. "<project>.git". Let's create such a public repository for
-project "my-git". After logging into the remote machine, create
-an empty directory:
-
- mkdir my-git.git
-
-Then, make that directory into a GIT repository by running
-git-init-db, but this time, since its name is not the usual
-".git", we do things slightly differently:
-
- GIT_DIR=my-git.git git-init-db
-
-Make sure this directory is reachable, via the transport of your
-choice, by the people you want to pull your changes. You also
-need to make sure that you have the "git-receive-pack"
-program on the $PATH.
-
-[ Side note: many installations of sshd do not invoke your shell
- as the login shell when you directly run programs; what this
- means is that if your login shell is bash, only .bashrc is
- read and not .bash_profile. As a workaround, make sure
- .bashrc sets up $PATH so that you can run 'git-receive-pack'
- program. ]
-
-Your "public repository" is now ready to accept your changes.
-Come back to the machine that has your private repository. From
-there, run this command:
-
- git push <public-host>:/path/to/my-git.git master
-
-This synchronizes your public repository to match the named
-branch head (i.e. "master" in this case) and objects reachable
-from it in your current repository.
-
-As a real example, this is how I update my public git
-repository. The kernel.org mirror network takes care of the
-propagation to other publicly visible machines:
-
- git push master.kernel.org:/pub/scm/git/git.git/
-
-
-[ Digression: your GIT "public" repository people can pull from
- is different from a public CVS repository that gives read-write
- access to multiple developers. It is a copy of _your_ primary
- repository published for others to use, and you should not
- push into it from more than one repository (this means, not
- just disallowing other developers to push into it, but also
- you should push into it from a single repository of yours).
- Sharing the result of work done by multiple people is always
- done by pulling (i.e. fetching and merging) from public
- repositories of those people. Typically this is done by the
- "project lead" person, and the resulting repository is
- published as the public repository of the "project lead" for
- everybody to base further changes on. ]
-
-
- Packing your repository
- -----------------------
-
-Earlier, we saw that one file under the .git/objects/??/ directory
-is stored for each git object you create. This representation
-is convenient and efficient to create atomically and safely, but
-not so convenient to transport over the network. Since git objects are
-immutable once they are created, there is a way to optimize the
-storage by "packing them together". The command
-
- git repack
-
-will do it for you. If you followed the tutorial examples, you
-would have accumulated about 17 objects in the .git/objects/??/
-directories by now. "git repack" tells you how many objects it
-packed, and stores the packed file in the .git/objects/pack
-directory.
-
-[ Side Note: you will see two files, pack-*.pack and pack-*.idx,
- in the .git/objects/pack directory. They are closely related to
- each other, and if you ever copy them by hand to a different
- repository for whatever reason, you should make sure you copy
- them together. The former holds all the data from the objects
- in the pack, and the latter holds the index for random
- access. ]
-
-If you are paranoid, running the "git-verify-pack" command would
-detect if you have a corrupt pack, but do not worry too much.
-Our programs are always perfect ;-).
-
-Once you have packed objects, you do not need to leave the
-unpacked objects that are contained in the pack file anymore.
-
- git prune-packed
-
-would remove them for you.
-
-You can try running "find .git/objects -type f" before and after
-you run "git prune-packed" if you are curious.
-
-[ Side Note: as we already mentioned, "git pull" is broken for
- some transports dealing with packed repositories right now, so
- do not run "git prune-packed" if you plan to give "git pull"
- access via HTTP transport for now. ]
-
-If you run "git repack" again at this point, it will say
-"Nothing to pack". Once you continue your development and
-accumulate the changes, running "git repack" again will create a
-new pack that contains objects created since you packed your
-archive the last time. We recommend that you pack your project
-soon after the initial import (unless you are starting your
-project from scratch), and then run "git repack" every once in a
-while, depending on how active your project is.
-
-When a repository is synchronized via "git push" and "git pull",
-objects packed in the source repository are usually stored
-unpacked in the destination, unless rsync transport is used.
-
-
- Working with Others
- -------------------
-
-Although git is a truly distributed system, it is often
-convenient to organize your project with an informal hierarchy
-of developers. Linux kernel development is run this way. There
-is a nice illustration (page 17, "Merges to Mainline") in Randy
-Dunlap's presentation (http://tinyurl.com/a2jdg).
-
-It should be stressed that this hierarchy is purely "informal".
-There is nothing fundamental in git that enforces the "chain of
-patch flow" this hierarchy implies. You do not have to pull
-from only one remote repository.
-
-
-A recommended workflow for a "project lead" goes like this:
-
- (1) Prepare your primary repository on your local machine. Your
- work is done there.
-
- (2) Prepare a public repository accessible to others.
-
- (3) Push into the public repository from your primary
- repository.
-
- (4) "git repack" the public repository. This establishes a big
- pack that contains the initial set of objects as the
- baseline, and possibly "git prune-packed" if the transport
- used for pulling from your repository supports packed
- repositories.
-
- (5) Keep working in your primary repository. Your changes
- include modifications of your own, patches you receive via
- e-mails, and merges resulting from pulling the "public"
- repositories of your "subsystem maintainers".
-
- You can repack this private repository whenever you feel
- like it.
-
- (6) Push your changes to the public repository, and announce it
- to the public.
-
- (7) Every once in a while, "git repack" the public repository.
- Go back to step (5) and continue working.
-
-
-A recommended work cycle for a "subsystem maintainer" that works
-on that project and has their own "public repository" goes like this:
-
- (1) Prepare your work repository by running "git clone" on the public
- repository of the "project lead". The URL used for the
- initial cloning is stored in .git/branches/origin.
-
- (2) Prepare a public repository accessible to others.
-
- (3) Copy over the packed files from the "project lead" public
- repository to your public repository by hand; this part is
- currently not automated.
-
- (4) Push into the public repository from your primary
- repository. Run "git repack", and possibly "git
- prune-packed" if the transport used for pulling from your
- repository supports packed repositories.
-
- (5) Keep working in your primary repository. Your changes
- include modifications of your own, patches you receive via
- e-mails, and merges resulting from pulling the "public"
- repositories of your "project lead" and possibly your
- "sub-subsystem maintainers".
-
- You can repack this private repository whenever you feel
- like it.
-
- (6) Push your changes to your public repository, and ask your
- "project lead" and possibly your "sub-subsystem
- maintainers" to pull from it.
-
- (7) Every once in a while, "git repack" the public repository.
- Go back to step (5) and continue working.
-
-
-A recommended work cycle for an "individual developer" who does
-not have a "public" repository is somewhat different. It goes
-like this:
-
- (1) Prepare your work repository by running "git clone" on the public
- repository of the "project lead" (or a "subsystem
- maintainer", if you work on a subsystem). The URL used for
- the initial cloning is stored in .git/branches/origin.
-
- (2) Do your work there. Make commits.
-
- (3) Run "git fetch origin" every once in a while to fetch from the
- public repository of your upstream. This does only the first
- half of "git pull"; it does not merge. The head of the
- public repository is stored in .git/refs/heads/origin.
-
- (4) Use "git cherry origin" to see which ones of your patches
- were accepted, and/or use "git rebase origin" to port your
- unmerged changes forward to the updated upstream.
-
- (5) Use "git format-patch origin" to prepare patches for e-mail
- submission to your upstream and send them out. Go back to
- step (2) and continue.
-
-
-[ to be continued.. cvsimports ]