Module 0.2
Git & GitHub
What is Version Control?
Version control is a system that records every change to your files over time, letting you recall any previous version. Think of it as an infinite undo button, not just for one file but for your entire project, going back to the very first line of code.
Git is the version control system that won. Created by Linus Torvalds (the creator of Linux) in 2005, it is now by far the most widely used version control system in software development. GitHub is a cloud platform that hosts Git repositories and adds collaboration features like pull requests and issue tracking.
Real-world usage: Every open-source project on GitHub uses Git. Companies like Google, Microsoft, and Anthropic rely on it. AI projects track model configs, training scripts, and experiment results with Git. If you're writing code and not using Git, you're working without a safety net.
Key terms:
- Repository (repo): a project folder tracked by Git. Contains your files plus a hidden .git/ folder with the full history.
- Commit: a snapshot of your project at a point in time. Like a save point in a video game.
- Branch: a parallel version of your project. You create branches to work on features without touching the main code.
- Remote: a copy of your repo hosted somewhere else (usually GitHub). Your local repo pushes to and pulls from remotes.
The Mental Model
Think of Git like a photo album for your code:
- You make changes to files (editing, adding, deleting)
- You stage the changes you want to capture (pick which photos go in the album)
- You commit: take the snapshot (paste them into the album with a caption)
- You push: upload the album page to GitHub so others can see it
Working Directory  →  Staging Area  →  Local Repository  →  Remote (GitHub)
(your files)          (git add)        (git commit)         (git push)
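If the four areas feel abstract, a toy model can help. The sketch below is not how Git stores data internally; the ToyRepo class and its method names are made up for illustration. It shows one thing beginners often miss: a commit records what was staged, not whatever happens to be in your working directory.

```python
# Toy model of the four areas. Hypothetical names; not Git's real internals.

class ToyRepo:
    def __init__(self):
        self.working = {}   # working directory: filename -> content
        self.staged = {}    # staging area: what the next commit will contain
        self.commits = []   # local repository: list of (message, snapshot)
        self.remote = []    # remote: commits that have been pushed

    def add(self, name):
        """Like `git add <name>`: copy the file's current content into staging."""
        self.staged[name] = self.working[name]

    def commit(self, message):
        """Like `git commit -m`: record a snapshot of the staging area."""
        self.commits.append((message, dict(self.staged)))

    def push(self):
        """Like `git push`: send local commits the remote does not have yet."""
        self.remote.extend(self.commits[len(self.remote):])


repo = ToyRepo()
repo.working["main.py"] = "print('v1')"
repo.working["notes.txt"] = "scratch"
repo.add("main.py")                       # only main.py is staged
repo.working["main.py"] = "print('v2')"   # edited again after staging
repo.commit("Add main.py")

message, snapshot = repo.commits[-1]
print(snapshot)          # holds the staged v1 of main.py; notes.txt is absent
repo.push()
print(len(repo.remote))  # the remote now has the one commit
```

The second edit to main.py never reached the commit because it was not staged again. That is the same reason running git status before committing is worth the two seconds.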
Every commit has a unique hash (like a3f2b7c), a message describing what changed, and a pointer to the previous commit. This chain of commits is your project history.
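To see why a chain of hashes makes a trustworthy history, here is a simplified sketch. Real Git hashes a richer payload (tree, author, timestamps, and more), so treat make_commit and history as hypothetical helpers that only illustrate the idea: each commit ID depends on its content and on its parent's ID.

```python
import hashlib

def make_commit(message, snapshot, parent):
    """Return a toy commit whose ID is a hash of its content and its parent's ID."""
    payload = f"{parent}\n{message}\n{sorted(snapshot.items())}".encode()
    return {
        "id": hashlib.sha1(payload).hexdigest(),
        "message": message,
        "snapshot": snapshot,
        "parent": parent,   # None for the very first commit
    }

def history(commit, by_id):
    """Walk parent pointers from a commit back to the first one, newest first."""
    messages = []
    while commit is not None:
        messages.append(commit["message"])
        commit = by_id.get(commit["parent"])
    return messages

first = make_commit("Initial commit", {"main.py": "v1"}, parent=None)
second = make_commit("Add error handling", {"main.py": "v2"}, parent=first["id"])
by_id = {c["id"]: c for c in (first, second)}

print(second["id"][:7])         # a short ID, the same shape as a3f2b7c
print(history(second, by_id))   # walks back through the chain
```

Because the parent's ID is part of the hashed payload, changing an old commit would change every ID after it. That is why rewritten history is easy to detect.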
Core Git Workflow
# First time setup (once ever)
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
# Starting a project
git init # Create a new repo in current directory
git clone <url> # OR download an existing repo from GitHub
# The daily cycle
git status # What's changed? What's staged?
git add main.py # Stage specific file
git add . # Stage everything (use carefully)
git commit -m "Add agent loop with error handling"
git push # Upload to GitHub
# Checking history
git log --oneline # Compact commit history
git diff # See unstaged changes
git diff --staged # See staged changes
git status is your best friend. Running it before and after major operations builds good habits. It tells you exactly what state your project is in: what's changed, what's staged, what's committed.
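If you ever want a script to read that state, git status --porcelain prints a stable, machine-friendly format: two status characters (staging area, then working directory), a space, and the path. The sketch below parses a hand-written sample of that output. The file names are invented and summarize_status is a hypothetical helper; renames and other less common codes are not handled.

```python
def summarize_status(porcelain_text):
    """Group paths from `git status --porcelain` (v1) output by state."""
    summary = {"staged": [], "unstaged": [], "untracked": []}
    for line in porcelain_text.splitlines():
        if len(line) < 4:
            continue                      # not a "XY path" entry
        index_state, worktree_state, path = line[0], line[1], line[3:]
        if line[:2] == "??":              # file Git is not tracking yet
            summary["untracked"].append(path)
            continue
        if line[:2] == "!!":              # ignored file (only with --ignored)
            continue
        if index_state != " ":            # change recorded in the staging area
            summary["staged"].append(path)
        if worktree_state != " ":         # change not yet staged
            summary["unstaged"].append(path)
    return summary

# Hand-written sample; real input would come from `git status --porcelain`.
sample = "M  main.py\n M agent.py\nMM config.yaml\n?? notes.txt\n"
print(summarize_status(sample))
```

Notice that config.yaml lands in both lists: part of its changes were staged, then it was edited again. This is the situation git diff and git diff --staged let you inspect separately.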
Branches
Branches let you work on features in isolation. The main branch is your source of truth. You create feature branches, do your work there, then merge back when done.
git branch # List branches (star marks current)
git checkout -b feature/auth # Create and switch to new branch
git checkout main # Switch back to main
git merge feature/auth # Merge feature branch into current branch
git branch -d feature/auth # Delete branch after merging
The mental model: Think of branches as parallel timelines. You branch off, make changes in your timeline, then merge the timelines back together. If two people changed the same line in different timelines, Git flags a merge conflict and asks you to resolve it manually.
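The rule Git applies when merging can be sketched in a few lines. This is a deliberately simplified version: it assumes all three versions have the same number of lines, whereas real Git also handles inserted and deleted lines. The function name and the config lines are made up for the example.

```python
def merge_lines(base, ours, theirs):
    """Three-way merge of equal-length line lists. Returns (merged, conflicts)."""
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree (or neither touched the line)
            merged.append(o)
        elif o == b:      # only their branch changed this line
            merged.append(t)
        elif t == b:      # only our branch changed this line
            merged.append(o)
        else:             # both changed it differently: a conflict
            merged.append(f"<<<<<<< ours\n{o}\n=======\n{t}\n>>>>>>> theirs")
            conflicts.append(i)
    return merged, conflicts

base   = ["timeout = 10", "retries = 3", "model = 'small'"]   # common ancestor
ours   = ["timeout = 30", "retries = 3", "model = 'medium'"]  # our branch
theirs = ["timeout = 10", "retries = 5", "model = 'large'"]   # their branch

merged, conflicts = merge_lines(base, ours, theirs)
print("\n".join(merged))
print("conflicting line indexes:", conflicts)
```

The first two lines merge cleanly because only one side changed each. The third was changed on both sides, so it comes back wrapped in conflict markers, much like the ones Git writes into a file for you to resolve by hand.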
Why it matters for agents: You'll branch constantly, with one branch per experiment, one per feature, one per bug fix. When building AI agents, you might have experiment/rag-chunking-512 and experiment/rag-chunking-1024 in progress at the same time. Git keeps the code and config for each experiment separate, so you can compare them across branches.
.gitignore
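The .gitignore file tells Git which untracked files to leave alone. Files Git is already tracking are not affected, so add your ignore rules before the first commit. It usually lives in your repo root.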
# Python
__pycache__/
*.pyc
.venv/
# Environment secrets
.env
.env.local
# OS files
.DS_Store
Thumbs.db
# IDE
.vscode/settings.json
.idea/
# Dependencies
node_modules/
Why it matters: Two things should never be in Git: secrets (API keys, passwords) and generated files (dependencies, compiled code, cache). Pushing an API key to a public GitHub repo is a common and expensive mistake. Bots scan GitHub constantly and will find it within minutes.
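To build intuition for how those patterns match, here is a rough approximation in Python. It covers only the three pattern shapes used in the example file (a directory name ending in a slash, a path containing a slash, and a plain name or glob). Real .gitignore rules also support negation with !, ** wildcards, and more, so is_ignored is a hypothetical helper, not a substitute for Git's own matching.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

PATTERNS = ["__pycache__/", "*.pyc", ".venv/", ".env", ".env.local",
            ".DS_Store", "Thumbs.db", ".vscode/settings.json", ".idea/",
            "node_modules/"]

def is_ignored(path, patterns=PATTERNS):
    """Approximate check for a file path given relative to the repo root."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore files inside any directory with that name.
            if any(fnmatchcase(part, pattern[:-1]) for part in parts[:-1]):
                return True
        elif "/" in pattern:
            # Pattern with a slash: match the whole path from the repo root.
            if fnmatchcase(path, pattern):
                return True
        else:
            # Plain name or glob: match any path component at any depth.
            if any(fnmatchcase(part, pattern) for part in parts):
                return True
    return False

for path in ["main.py", ".env", "src/__pycache__/agent.cpython-311.pyc",
             ".vscode/settings.json", ".vscode/launch.json"]:
    print(path, "ignored" if is_ignored(path) else "tracked")
```

In a real repo, git check-ignore -v followed by a path tells you whether Git ignores it and which pattern is responsible.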
GitHub and Pull Requests
GitHub adds a collaboration layer on top of Git. The most important feature is the pull request (PR), a proposal to merge one branch into another.
The PR workflow:
- Create a branch and make your changes
- Push the branch to GitHub
- Open a PR on GitHub: describe what you changed and why
- Teammates review your code, leave comments
- You address feedback, push more commits
- PR gets approved and merged into main
# Push a new branch to GitHub
git push -u origin feature/auth
# Then open a PR on GitHub's web interface
# (or use the GitHub CLI: gh pr create)
PRs create an auditable record of every change. You can see who changed what, why, and who approved it. In agent development, this matters for tracking configuration changes that affect model behavior.