Understanding Git LFS and Managing Large Files in Git

Git Large File Storage (LFS) is a crucial tool for developers and teams dealing with large files in Git repositories. Managing these files efficiently can prevent repository bloat and performance issues. In this blog post, we explore how Git LFS works and how to effectively utilize it with practical commands.

What is Git LFS?

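Git LFS is an extension to Git for handling large files more efficiently. Storing large files directly in a Git repository slows down operations such as cloning and fetching and increases repository size, because every version of every file is kept in history. With Git LFS, the repository stores a small text pointer for each tracked file, while the actual file content lives on a Git LFS server and is downloaded when you check out a commit that needs it.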
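
To make the pointer idea concrete, here is a minimal sketch that builds the text of a pointer for a file. The helper name make_lfs_pointer is hypothetical and exists only for illustration; Git LFS generates pointers itself when a tracked file is added. The sketch assumes the three-line version/oid/size layout described in the Git LFS pointer specification and a system that provides sha256sum (as Ubuntu and Debian do).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of Git LFS): print the pointer text
# that would stand in for the given file inside the Git repository.
make_lfs_pointer() {
  local file="$1"
  local oid size
  oid=$(sha256sum "$file" | cut -d' ' -f1)       # content hash identifies the object
  size=$(wc -c < "$file" | tr -d '[:space:]')    # size in bytes
  printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s\n' "$oid" "$size"
}

# Demo on a throwaway 5-byte file standing in for a large asset.
demo_file=$(mktemp)
printf 'hello' > "$demo_file"
make_lfs_pointer "$demo_file"
rm -f "$demo_file"
# Prints:
# version https://git-lfs.github.com/spec/v1
# oid sha256:2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
# size 5
```

However large the real file is, the pointer stays at a few lines of text, which is what keeps the repository itself small.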

Installing Git LFS

To begin using Git LFS, first install it on your system. For Ubuntu or Debian-based systems, use the following command:

Bash Code

sudo apt install git-lfs


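After installation, run the following once per user account. It registers the Git LFS filters in your global Git configuration and, when run inside a repository, also installs the LFS hooks there: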

Bash Code

git lfs install
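
To check that the setup took effect, something like the following sketch can help. The function name lfs_status is hypothetical; the commands it calls (git lfs version and git config --get-regexp) are standard, and the assumption is that git lfs install has stored its settings under filter.lfs.* in your Git configuration.

```shell
# Hypothetical helper: report whether Git LFS is installed and configured.
lfs_status() {
  if ! command -v git-lfs >/dev/null 2>&1; then
    echo "git-lfs not found on PATH"
    return 0
  fi
  git lfs version
  # `git lfs install` registers filter.lfs.* settings in the Git config.
  git config --get-regexp '^filter\.lfs\.' \
    || echo "filter.lfs.* not configured; run: git lfs install"
}

lfs_status
```

If the filter settings are listed, Git will hand tracked files to Git LFS whenever they are added or checked out.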


Tracking Large Files with Git LFS

Tracking large files involves specifying which files Git LFS should manage. Here’s how you track a specific file, for example, a PDF document:

Bash Code


git lfs track "books/How To Start Your Own Business - The Facts Visually Explained 2021.pdf"


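This command tells Git LFS to manage the large PDF file located in the books directory. It does so by adding an entry for that path to the .gitattributes file, which is why .gitattributes needs to be committed too. To track every file of a given type instead, pass a pattern such as "*.pdf".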
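
The effect of tracking can be inspected with plain Git. The sketch below works in a throwaway repository and writes by hand the kind of line that git lfs track "*.pdf" would add to .gitattributes (an assumption made so the example runs without Git LFS installed), then asks Git which filter applies to two example paths. The file names are illustrative only.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository so nothing in a real project is touched.
repo=$(mktemp -d)
git init -q "$repo"

# The kind of entry `git lfs track "*.pdf"` records in .gitattributes.
printf '*.pdf filter=lfs diff=lfs merge=lfs -text\n' > "$repo/.gitattributes"

# Ask Git which filter applies to each path (the files need not exist).
git -C "$repo" check-attr filter -- books/example.pdf
# books/example.pdf: filter: lfs
git -C "$repo" check-attr filter -- README.md
# README.md: filter: unspecified
```

Any path that reports filter: lfs will be stored through Git LFS when it is added; everything else is handled by Git as usual.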

Committing and Pushing Changes

Once you've tracked your large files, add them to your Git repository as usual:

Bash Code


git add .gitattributes

git add "books/How To Start Your Own Business - The Facts Visually Explained 2021.pdf"

git commit -m "Add large PDF file tracked by Git LFS"

git push
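
After committing, it is worth confirming that the file really went in as an LFS pointer rather than as a regular blob. Git LFS offers git lfs ls-files for this; the sketch below is an alternative that needs only plain Git. The helper name is_lfs_pointer is hypothetical, and it assumes the pointer's first line is the version line from the Git LFS pointer specification.

```shell
# Hypothetical helper: succeed if the blob committed at <rev>:<path>
# is a Git LFS pointer rather than the full file content.
is_lfs_pointer() {
  local rev="$1" path="$2"
  git cat-file -p "${rev}:${path}" 2>/dev/null \
    | head -n 1 \
    | grep -q '^version https://git-lfs.github.com/spec/v1$'
}

# Usage (inside a repository):
#   if is_lfs_pointer HEAD "books/example.pdf"; then
#     echo "stored via Git LFS"
#   else
#     echo "stored as a regular Git blob"
#   fi
```

If a file shows up as a regular blob, it was most likely added before its path was tracked, and the full content is now part of the repository's history.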


Managing and Cleaning Up Large Files

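If a large file has already been committed and you need to remove it from your repository's history, third-party tools such as BFG Repo-Cleaner and git filter-repo can rewrite that history. Both change commit hashes, so work on a fresh clone, and expect to force-push and have collaborators re-clone afterwards. For instance, to remove a specific large file from the history with BFG Repo-Cleaner (this assumes bfg is available as a command, for example as a wrapper around the BFG jar; note that BFG matches on file name, not on path):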

Bash Code


bfg --delete-files "Best-Selling House Plans.epub"

git reflog expire --expire=now --all && git gc --prune=now --aggressive


Alternatively, git filter-repo can rewrite history to drop a path. Here --path selects the file and --invert-paths keeps everything except it:

Bash Code

git filter-repo --path "books/Best-Selling House Plans.epub" --invert-paths
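
Before running either tool, it helps to know which files are actually taking up space. The following sketch lists the largest blobs across all of history using only standard Git plumbing commands. The function name largest_blobs is hypothetical.

```shell
# Hypothetical helper: print the N largest blobs in the whole history
# as "<size in bytes> <path>", largest first (default N = 10).
largest_blobs() {
  local count="${1:-10}"
  git rev-list --objects --all \
    | git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' \
    | awk '$1 == "blob" { sub(/^blob /, ""); print }' \
    | sort -n -r \
    | head -n "$count"
}

# Usage (inside a repository):
#   largest_blobs 5
```

The paths it reports are candidates for the cleanup commands above, and for tracking with Git LFS from then on.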


Conclusion

Git LFS simplifies the management of large files in Git repositories, improving performance and reducing repository size. By using Git LFS to track large files and tools like BFG Repo-Cleaner or git filter-repo to manage history, developers can maintain clean and efficient repositories.

Incorporating Git LFS into your workflow can make collaboration smoother and repositories easier to manage, especially when dealing with large assets. Whether you're working on software projects or handling multimedia files, it is worth setting up before large files enter your history.