Managing Complex Integrations Through Git

Essential Git Commands for Complex Integrations

This guide provides solutions for managing complex Git integrations, particularly in monorepos with multiple submodules. The commands below are essential tools for your workflow.

Quick Command Reference

Amending Commits

bash
git commit --amend --no-edit

Force Pushing (use with caution)

bash
# Standard force push (dangerous)
git push --force origin development

# Safer force push (recommended)
git push --force-with-lease origin development

Cleaning and Cache Management

bash
# Remove directories from Git cache
git rm -r --cached scripts site_archive

# Delete backup files
find content/changelog--code -name "*.bak" -type f -delete

# Create backup files
for file in content/changelog--code/*.md; do cp "$file" "${file}.bak"; done

Managing Submodule Branch Tracking

The Challenge: Synchronizing Branch States

When managing a monorepo with multiple submodules, we need to ensure that:
  • The development branch tracks development branches of submodules
  • The master branch tracks master branches of submodules
This requires updating multiple lines in the .gitmodules file when switching branches.

Solution: Using Stream Editor (sed)

Basic Workflow

bash
# First, create a backup of .gitmodules
cp .gitmodules .gitmodules.bak

# Preview changes (prints what would change without modifying the file)
sed 's/branch = development/branch = master/g' .gitmodules | diff .gitmodules -

# If the preview looks correct, apply changes
sed -i '' 's/branch = development/branch = master/g' .gitmodules

Understanding the Command

  1. sed - Stream EDitor, processes text line by line
  2. -i '' - In-place edit flag (empty quotes required on macOS)
  3. 's/branch = development/branch = master/g'
    • s/ starts a substitution
    • branch = development is the pattern to find
    • branch = master is the replacement
    • /g means global (replace all occurrences)

Safety Notes

  1. Always create a backup before modifying .gitmodules
  2. Preview changes using diff before applying
  3. The mdbook submodule (external tool) should have no branch specification
  4. To revert: either restore from backup or swap 'master' and 'development' in the command

Real-World Example: Merging Development into Master

The Challenge

We needed to consolidate all the development work from various submodules and the monorepo into their respective master branches. This involved:
  1. Merging development changes in each submodule to their master branches
  2. Updating the monorepo's master branch to point to these new master states
  3. Ensuring all submodules track their master branches when the monorepo is on master

Learning Through Failure: Our Attempts

Attempt 1: Direct Merge with Submodules

bash
git checkout master
git merge development --no-ff
Failed because: Submodules were still pointing to development branches, causing conflicts.

Attempt 2: Trying to Clean Untracked Files

bash
git clean -f && git checkout master
Failed because: Couldn't handle nested .git directories in submodules properly.

Attempt 3: Attempting to Reset Submodules

bash
git submodule deinit -f md-cookbook && git submodule update --init md-cookbook
Failed because: Still had untracked files preventing branch switch.

Attempt 4: Complex Merge with Unrelated Histories

bash
git merge development --no-ff --allow-unrelated-histories
Failed with multiple conflicts in submodules and files.

The "Aha!" Moment

We realized that:
  1. Each submodule needs to be handled independently first
  2. The monorepo's master branch should simply take all content from development
  3. The .gitmodules file needs to be updated to track master branches

Final Solution

1. For Each Submodule:

bash
# Example for md-cookbook
cd md-cookbook
git checkout master
git merge development --no-ff -m "docs: Enhance cookbook documentation and clarify project scope"
git push origin master
cd ..

# Repeat for other submodules (content, site_archive, etc.)

2. Update Monorepo Master:

bash
# Checkout master and take all development changes
git checkout master
git checkout development -- .
git add .
git commit -m "monorepo: Consolidate development changes into master"
git push origin master

3. Update Branch Tracking:

bash
# Update .gitmodules to track master branches
sed -i '' 's/branch = development/branch = master/g' .gitmodules
git add .gitmodules
git commit -m "config: Update submodules to track master branches"
git push origin master

Key Learnings

  1. Handle submodule merges first, independently in each submodule
  2. Don't try to merge the monorepo while submodules are in a mixed state
  3. Use git checkout --theirs or similar when you want to take all changes from one branch
  4. Remember to update .gitmodules to track the correct branches
  5. Push changes in both submodules and monorepo to maintain consistency

Best Practices for Next Time

  1. First merge and push all submodule changes to their respective master branches
  2. Then update the monorepo's master branch to take all development changes
  3. Finally update .gitmodules to ensure proper branch tracking
  4. Always push changes to maintain remote synchronization

Switching Back to Development

After merging to master, you'll often need to switch back to development. Here's how to do it in one command:
bash
# Switch monorepo and all submodules (except mdbook) back to development
sed -i '' 's/branch = master/branch = development/g' .gitmodules && \
git add .gitmodules && \
git commit -m "config: Update submodules to track development branches" && \
git checkout development && \
git submodule foreach 'if [ "$path" != "mdbook" ]; then git checkout development || true; fi'

Understanding the Command

This command performs several operations in sequence:
  1. sed -i '' 's/branch = master/branch = development/g' .gitmodules
    • Updates .gitmodules to track development branches
    • The -i '' flag makes changes in-place (empty quotes required on macOS)
  2. git add .gitmodules && git commit
    • Stages and commits the .gitmodules changes
    • Ensures branch tracking is properly recorded
  3. git checkout development
    • Switches the monorepo to its development branch
  4. git submodule foreach 'if [ "$path" != "mdbook" ]; then git checkout development || true; fi'
    • Runs a command in each submodule
    • The if condition excludes the mdbook submodule (external dependency)
    • || true ensures the command continues even if one submodule fails
    • Switches each submodule to its development branch

Safety Notes

  1. This command assumes all submodules (except mdbook) have a development branch
  2. The || true prevents the command from failing if any submodule is in an unexpected state
  3. Always commit or stash local changes before running this command
  4. Verify the state of critical submodules after switching