zzz Aditya here

Git Internals

Git can seem like hidden computer magic if you don’t understand how it works inside. Let’s break it down in simple terms to get a basic picture of it:

.git folder

Where does Git store all of its history? It is really hard to guess if your IDE keeps hidden folders out of view.

Every time you initialize a repo with the “git init” command, a hidden folder named “.git” is created inside it. This is the warehouse where Git stores everything about your project.

Initially it looks something like this:

$ ls -1a .git
./
../
config
description
FETCH_HEAD
HEAD
hooks/
info/
objects/
refs/

We don’t know yet what these directories and files are, so let’s look at each one and see how it helps Git track every modification so smoothly.

Config

It contains configuration specific to your current repo, and it overrides your global and system-wide Git configuration.
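To see the override in action, here is a small sketch that assumes Git is installed and works in a throwaway directory. The name “Repo Only Name” is just a placeholder for illustration.

```shell
# Throwaway repo so nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Write a setting into this repo's own config file (.git/config).
git config --local user.name "Repo Only Name"

# Read it back; the repo-level value is the one Git uses in this repo.
local_name=$(git config --local --get user.name)
echo "$local_name"

# The setting is stored as plain text inside .git/config.
cat .git/config
```

Settings written with “--local” only affect this one repo, which makes them handy for things like a work email on a single project.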

Description

A short text describing your project. It is mainly used by GitWeb, so you can usually ignore it.

HEAD

It is a pointer to the branch you’re working on right now.
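HEAD is a plain text file, so you can simply print it. The sketch below assumes Git is installed and uses a throwaway repo. The identity and the branch name “feature” are placeholders.

```shell
# Throwaway repo so nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"                # placeholder identity
git config user.email "example@example.com"   # placeholder identity

git commit -q --allow-empty -m "first commit"

# HEAD is a one-line file naming the ref of the current branch.
cat .git/HEAD

# Creating and switching to a new branch rewrites that line.
git checkout -q -b feature
head_after=$(cat .git/HEAD)
echo "$head_after"    # ref: refs/heads/feature
```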

Hooks

You can place executable scripts here (shell scripts, for example) that Git runs before or after certain events. Some of these hooks are post-update, pre-commit, pre-rebase and more.
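As a rough sketch of a hook, the example below installs a pre-commit script that blocks a commit when the staged changes contain the word TODO. The TODO policy, the file name “notes.txt” and the identity are all made up for illustration, and it assumes Git is installed and no custom hooks path is configured.

```shell
# Throwaway repo so nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"                # placeholder identity
git config user.email "example@example.com"   # placeholder identity

# Install the hook; Git runs it before every commit in this repo.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Hypothetical policy: refuse the commit if an added line contains TODO.
if git diff --cached | grep -q '^+.*TODO'; then
    echo "pre-commit: staged changes contain TODO" >&2
    exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

# Stage a file that breaks the policy and try to commit it.
echo "TODO: finish this" > notes.txt
git add notes.txt
if git commit -q -m "try to commit a TODO"; then
    status=committed
else
    status=rejected
fi
echo "$status"    # rejected
```

A hook that exits with a non-zero status stops the commit, which is why the commit above never gets created.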

Info

It holds the “exclude” file, which tells Git to ignore some files locally when you don’t wish to list them in the shared .gitignore file.
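Here is a quick sketch of a local ignore rule, assuming Git is installed. The file names “scratch.log” and “tracked.txt” are placeholders.

```shell
# Throwaway repo so nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Add a local-only ignore pattern; this file is never committed or shared.
mkdir -p .git/info
echo "scratch.log" >> .git/info/exclude

touch scratch.log tracked.txt

# Only tracked.txt shows up as untracked; scratch.log is ignored.
visible=$(git status --porcelain)
echo "$visible"

# Ask Git which rule is responsible for ignoring the file.
git check-ignore -v scratch.log
```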

Objects

It’s the directory that stores all Git objects. Each object lives in a subdirectory named after the first two characters of its hash, in a file named after the remaining characters. We’ll talk more about Git objects further on.
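To see how an object ends up on disk, the sketch below stores a small piece of text as an object and then locates its file. It assumes Git is installed and uses the default SHA-1 object format.

```shell
# Throwaway repo so nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Store the text "hello world" (plus newline) as an object and print its hash.
hash=$(echo "hello world" | git hash-object -w --stdin)
echo "$hash"    # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad

# First two characters name the subdirectory, the rest names the file.
dir=$(echo "$hash" | cut -c1-2)
file=$(echo "$hash" | cut -c3-)
ls ".git/objects/$dir/$file"
```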

Refs

Contains references, which are small files holding commit hashes. The “heads” subdirectory has one file per branch and the “tags” subdirectory has one per tag.
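The sketch below shows that a branch and a tag are each just a file containing a commit hash. It assumes Git is installed and that refs are stored as loose files, which is the case in a fresh repo. The tag name “v0.1” and the identity are placeholders.

```shell
# Throwaway repo so nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"                # placeholder identity
git config user.email "example@example.com"   # placeholder identity

git commit -q --allow-empty -m "first commit"
git tag v0.1

# The default branch name depends on your Git configuration.
branch=$(git symbolic-ref --short HEAD)

# Each ref is a file whose content is a commit hash.
branch_ref=$(cat ".git/refs/heads/$branch")
tag_ref=$(cat .git/refs/tags/v0.1)
head_commit=$(git rev-parse HEAD)

echo "branch: $branch_ref"
echo "tag:    $tag_ref"
echo "HEAD:   $head_commit"
```

All three lines print the same hash, because the branch, the tag and HEAD all point to the only commit in the repo.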

Logs

We have only just initialized the repo and haven’t done anything yet, so this directory is not shown above. It appears as soon as you make a commit, and it tracks every movement of HEAD and of each branch along with the commit hashes involved.
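A small sketch of the log in action, assuming Git is installed with its default settings for a normal (non-bare) repo. The identity is a placeholder.

```shell
# Throwaway repo so nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"                # placeholder identity
git config user.email "example@example.com"   # placeholder identity

git commit -q --allow-empty -m "first commit"
git commit -q --allow-empty -m "second commit"

# One line per movement of HEAD: old hash, new hash, who, when, why.
cat .git/logs/HEAD
entries=$(grep -c '' .git/logs/HEAD)
echo "$entries"    # 2

# The same history in a friendlier form.
git --no-pager reflog
```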

Index

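It is a file that holds the staging area: the list of paths and blob hashes that will form the tree of your next commit. Like Logs, it is not shown above because it is created when you first stage a change. We’ll get to trees just below.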
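The index is a binary file, but “git ls-files --stage” prints what it holds. The sketch below assumes Git is installed. The file name “hello.txt” is a placeholder.

```shell
# Throwaway repo so nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q

# A fresh repo has no index file yet.
if [ -e .git/index ]; then before=present; else before=absent; fi
echo "$before"    # absent

# Staging a file creates the index and records the file in it.
echo "hello world" > hello.txt
git add hello.txt

# Each entry shows: file mode, blob hash, stage number, path.
staged=$(git ls-files --stage)
echo "$staged"
```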

Git Objects

Blobs

Blobs hold the content of your files. A blob’s hash is calculated from that content alone, so the file name is not part of it.

Trees

Trees are the directories of your project. A tree can contain blobs as well as sub-trees, and it lists the name, mode and hash of every entry in it. And yes, a tree has a hash of its own as well.

Commits

A commit object references the tree that holds the snapshot of your repo at that point. It also stores more information, such as its parent commits, the author’s info and the commit message.

You can’t open Git objects like normal files because they are stored compressed, but the “git cat-file” command lets you look inside an object and check its content.
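Here is a short sketch that walks from a commit to its tree and then to a blob using “git cat-file”. It assumes Git is installed. The file name “hello.txt” and the identity are placeholders.

```shell
# Throwaway repo so nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"                # placeholder identity
git config user.email "example@example.com"   # placeholder identity

echo "hello world" > hello.txt
git add hello.txt
git commit -q -m "add hello.txt"

# -t prints the object type, -p pretty-prints the content.
git cat-file -t HEAD             # commit
git cat-file -p HEAD             # tree hash, author, committer, message

# The commit's tree lists the blob for hello.txt.
git cat-file -p 'HEAD^{tree}'

# The blob holds only the file content.
blob=$(git rev-parse HEAD:hello.txt)
git cat-file -t "$blob"          # blob
blob_content=$(git cat-file -p "$blob")
echo "$blob_content"             # hello world
```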

Git Hashing

Git uses SHA-1 hashing to generate an identifier for every Git object. The hash is computed from the object’s content together with a short header giving its type and size, so identical content always produces the same hash. These hashes allow Git to track objects and distinguish between them.
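You can reproduce a blob hash by hand. This sketch assumes Git is installed with the default SHA-1 object format and that the “sha1sum” tool (from GNU coreutils) is available. On systems without it, another SHA-1 tool would be needed.

```shell
content="hello world"

# Size in bytes of the content including its trailing newline.
size=$(printf '%s\n' "$content" | wc -c | tr -d ' ')

# Git hashes the header "blob <size>", a NUL byte, then the content.
manual=$(printf 'blob %s\0%s\n' "$size" "$content" | sha1sum | cut -d' ' -f1)

# Ask Git for the hash of the same content.
via_git=$(printf '%s\n' "$content" | git hash-object --stdin)

echo "$manual"
echo "$via_git"
```

Both lines print the same hash, which shows that an object’s name depends only on its type, size and content.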

Workflow on how a commit is saved

git add
Local → Blob → Index

git commit
Index → Tree → Commit → HEAD
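The two steps above can be watched by counting the files in the objects directory. This is a sketch that assumes Git is installed and that new objects are stored as loose files, which is the case in a fresh repo. The helper “count_objects”, the file name and the identity are made up for illustration.

```shell
# Throwaway repo so nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"                # placeholder identity
git config user.email "example@example.com"   # placeholder identity

# Hypothetical helper: number of object files currently stored.
count_objects() { find .git/objects -type f | wc -l | tr -d ' '; }

echo "hello world" > hello.txt

# git add: the file content becomes a blob and the index records it.
git add hello.txt
after_add=$(count_objects)
echo "$after_add"       # 1 (the blob)

# git commit: a tree is built from the index, a commit points to the tree,
# and the current branch (which HEAD points to) moves to the new commit.
git commit -q -m "add hello.txt"
after_commit=$(count_objects)
echo "$after_commit"    # 3 (blob, tree, commit)
```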

Staging Changes

Committing Changes