Commit 6fa7936

bk2204 and chrisd8088 committed
git: improve sparse file support

When we invoke `git ls-files` to try to find all LFS files, we don't honour sparse file paths or exclusions. While we should never actually traverse excluded files, using the `--exclude-standard` option can avoid loading some data with filtered clones, which may result in less data being downloaded.

In addition, we can honour sparse checkouts, since this code path is only used to handle the working tree and we know that the only files we need to consider are those Git actually put in the working tree.

The `--sparse` option is new in 2.35, but we already require 2.42 above, so we can use it unconditionally.

Co-authored-by: Chris Darroch <[email protected]>

1 parent 3990c7a commit 6fa7936

2 files changed: 61 additions, 0 deletions

git/git.go

Lines changed: 2 additions & 0 deletions
@@ -329,7 +329,9 @@ func LsFilesLFS() (*subprocess.BufferedCmd, error) {
 	return gitNoLFSBuffered(
 		"ls-files",
 		"--cached",
+		"--exclude-standard",
 		"--full-name",
+		"--sparse",
 		"-z",
 		"--format=%(objectmode) %(objecttype) %(objectname) %(objectsize)\t%(path)",
 		":(top,attr:filter=lfs)",

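The `-z` and `--format` arguments above produce NUL-terminated records, each with four space-separated fields, a tab, and the path. As a rough illustration of how such output might be consumed, here is a minimal Go sketch. It is not the git-lfs implementation: `lsFilesEntry` and `parseLsFiles` are hypothetical names, and the handling of a `-` size rests on the assumption that Git prints `-` for entries that are not blobs (such as sparse directory entries).

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
	"strings"
)

// lsFilesEntry is a hypothetical type (not taken from git-lfs) holding one
// record of "%(objectmode) %(objecttype) %(objectname) %(objectsize)\t%(path)".
type lsFilesEntry struct {
	Mode string
	Type string
	OID  string
	Size int64 // -1 when no numeric size was reported
	Path string
}

// parseLsFiles splits NUL-separated "git ls-files -z" output into entries.
func parseLsFiles(out []byte) ([]lsFilesEntry, error) {
	var entries []lsFilesEntry
	for _, rec := range bytes.Split(out, []byte{0}) {
		if len(rec) == 0 {
			continue // skip the empty piece after the final NUL
		}
		// Split at the first tab only, so any later tabs stay in the path.
		meta, path, ok := strings.Cut(string(rec), "\t")
		if !ok {
			return nil, fmt.Errorf("record %q has no tab separator", rec)
		}
		fields := strings.Fields(meta)
		if len(fields) != 4 {
			return nil, fmt.Errorf("record %q has %d fields, want 4", rec, len(fields))
		}
		size := int64(-1)
		// Assumption: "-" stands for "no size", e.g. for tree entries.
		if fields[3] != "-" {
			n, err := strconv.ParseInt(fields[3], 10, 64)
			if err != nil {
				return nil, fmt.Errorf("record %q: bad size: %w", rec, err)
			}
			size = n
		}
		entries = append(entries, lsFilesEntry{
			Mode: fields[0], Type: fields[1], OID: fields[2],
			Size: size, Path: path,
		})
	}
	return entries, nil
}

func main() {
	// Made-up sample data; the object names are placeholders, not real OIDs.
	oid := strings.Repeat("a", 40)
	sample := []byte("100644 blob " + oid + " 1\ta.dat\x00" +
		"100644 blob " + oid + " 1\tin/b.dat\x00")
	entries, err := parseLsFiles(sample)
	if err != nil {
		panic(err)
	}
	for _, e := range entries {
		fmt.Printf("%s %s size=%d path=%s\n", e.Mode, e.Type, e.Size, e.Path)
	}
}
```

Splitting on NUL first and on the tab second means paths containing spaces or newlines survive intact, which is presumably why the commit keeps `-z` alongside `--format`.
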
t/t-pull.sh

Lines changed: 59 additions & 0 deletions
@@ -392,3 +392,62 @@ begin_test "pull with empty file doesn't modify mtime"
   diff -u foo.mtime foo.mtime2
 )
 end_test
+
+begin_test "pull with partial clone and sparse checkout and index"
+(
+  set -e
+
+  # Only test with Git version 2.42.0 as it introduced support for the
+  # "objecttype" format option to the "git ls-files" command, which our
+  # code requires.
+  ensure_git_version_isnt "$VERSION_LOWER" "2.42.0"
+
+  reponame="$(basename "$0" ".sh")"
+  setup_remote_repo "$reponame"
+
+  clone_repo "$reponame" repo
+
+  git lfs track "*.dat"
+
+  contents1="a"
+  contents1_oid=$(calc_oid "$contents1")
+  contents2="b"
+  contents2_oid=$(calc_oid "$contents2")
+  contents3="c"
+  contents3_oid=$(calc_oid "$contents3")
+
+  mkdir in out
+  printf "%s" "$contents1" > a.dat
+  printf "%s" "$contents2" > in/b.dat
+  printf "%s" "$contents3" > out/c.dat
+  git add .
+  git commit -m "add files"
+
+  git push origin main
+
+  assert_server_object "$reponame" "$contents1_oid"
+  assert_server_object "$reponame" "$contents2_oid"
+  assert_server_object "$reponame" "$contents3_oid"
+
+  # Create a partial clone with a cone-mode sparse checkout of one directory
+  # and a sparse index, which is important because otherwise the "git ls-files"
+  # command ignores the --sparse option and lists all LFS files.
+  cd ..
+  git clone --filter=tree:0 --depth=1 --no-checkout \
+    "$GITSERVER/$reponame" partial
+
+  cd partial
+  git sparse-checkout init --cone --sparse-index
+  git sparse-checkout set in
+  git checkout main
+
+  assert_local_object "$contents1_oid" 1
+  assert_local_object "$contents2_oid" 1
+  refute_local_object "$contents3_oid"
+
+  git lfs pull 2>&1 | tee pull.log
+  grep -q "Downloading LFS objects" pull.log && exit 1
+
+  refute_local_object "$contents3_oid"
+)
+end_test
