Commit 51047a8

commands,t: simplify missing object push handling
The original version of the ensureFile() method of the uploadContext structure in our "commands" package was first introduced in commit 5d239d6 of PR git-lfs#176 in 2015, and has been refactored a number of times since then, but continues to be called for each object a push operation intends to upload.

The method is documented as a function that checks whether a Git LFS object file exists in the local .git/lfs/objects storage directories for a given pointer, and if it does not, tries to replace the missing object file by passing the contents of the corresponding file in the working tree through our "clean" filter.

The description of PR git-lfs#176 explains the function's purpose as follows, using the original pre-release name for the Git LFS project:

  If the .git/hawser/objects directory gets into a weird state (for example, if the user manually removed some files in there), this attempts to re-clean the objects based on the git repository file path.

The code comments preceding the ensureFile() method also describe it in the same way, as do the notes in later PRs, such as PR git-lfs#2574, in which the "lfs.allowIncompletePush" configuration option was introduced, and PR git-lfs#3398, which refined the error message the Git LFS client reports during a push operation when an object is missing locally and is also not present on the remote server.

However, the ensureFile() method has never actually replaced missing object files under any circumstances. It does check whether an object file is missing from the local storage directories, and if so, tests whether a file exists in the current working tree at the path of the Git LFS pointer associated with the object. If such a file exists, the method proceeds to run the Clean() method of the GitFilter structure in our "lfs" package on the file's contents.

The Clean() method calculates the SHA-256 hash value of the file's contents and creates a Pointer structure containing this hash value, and also writes a copy of the file's data into a temporary file in the .git/lfs/tmp directory. It is then the responsibility of the caller to determine whether or not this temporary file should be moved into place in the .git/lfs/objects directory hierarchy.

The only other caller of the Clean() method, besides the ensureFile() method, is the clean() function in the "commands" package, which is used by multiple Git LFS commands including the "git lfs clean" and "git lfs filter-process" plumbing commands, as well as the "git lfs migrate import" command.

The clean() function performs several tasks after invoking the Clean() method. First, it checks whether the file processed by the method was found to contain a Git LFS pointer; if so, no further action is taken as we assume the file in the working tree has not been passed through our "smudge" filter, and we do not want to create another pointer which simply hashes and references the existing one.

Next, the clean() function checks whether a file already exists in the local .git/lfs/objects storage directories at the location into which the function would otherwise expect to move the temporary file created by the Clean() method. If a file does exist in this location and has the same size as the temporary file, no further action is taken, as we assume it contains the same contents and does not need to be updated.

Assuming neither of these checks causes the clean() function to return early, the function moves the temporary file created by the Clean() method into the expected location within the .git/lfs/objects directory hierarchy.

Unfortunately, because the ensureFile() method invokes the Clean() method of the GitFilter structure but never performs any of the subsequent steps taken by the clean() function, it never recreates a missing Git LFS object from a file found in the working tree.
This appears to have been the case at the time the ensureFile() method was introduced in PR git-lfs#176, and has remained so ever since.

We could attempt to remedy this situation by altering the ensureFile() method so it calls the clean() function. To do so it would need to simulate the conditions under which the function usually runs, specifically within the "clean" filter context where the function is expected to transform an input data stream into an output data stream. We would likely use the Discard writer from the "io" package of the standard Go library to simply discard the output from the clean() function, as we do not need to send it back to Git in the way the "git lfs filter-process" or "git lfs clean" commands do.

However, we would have to add logic to the clean() function to guard against the case where the file in the working tree had different contents than those of the missing Git LFS object. Because the user may check out a Git reference in which a different file exists at the same path in the working tree, or may simply modify the file in the working tree independently, there is no guarantee that the file we pass through the Clean() method is identical to the one from which the missing Git LFS object was created.

The original implementation of the ensureFile() method, although it did not fulfil its stated purpose, did include a check to verify that the SHA-256 hash of the working tree file, as returned by the Clean() method, matched that of the missing object. This check was removed in commit 338ab40 of PR git-lfs#1812, which would likely have introduced a serious bug, except that the ensureFile() method never actually replaced any missing objects and so the removal of this check had no functional impact.

While we could try to revise the ensureFile() method to operate as was originally intended, the advantages of such a change are relatively slim, and the disadvantages are several.
Most obviously, it requires modifications to our clean() function to guard against the replacement of object files with incorrect data, something the other callers of the function do not need to be concerned about. That this is a concern at all is in turn due to the reasonable chance that a file found in the current working tree at a given path does not contain the same data as a Git LFS object generated from another file previously located at the same path.

As well, the fact that the ensureFile() method has never worked as designed, despite being repeatedly refactored and enhanced over ten years, suggests that its purpose is somewhat obscure and that the requisite logic is less intelligible than would be ideal. Users and developers expect push operations to involve the transfer of data but not the creation (or re-creation) of local data files, so the use of some of our "clean" filter code in such a context is not particularly intuitive.

For all these reasons, we remove the ensureFile() method entirely, which simplifies our handling of missing objects during upload transfer operations. Instead, we check for the presence of each object file we intend to push in the uploadTransfer() method of our uploadContext structure, and if a file is not found in the local storage directories, we flag it as missing, unless the "lfs.allowIncompletePush" configuration option is set to "true".

We also use the IsNotExist() function from the "os" package in the Go standard library to ascertain whether an object file is missing, or if some other type of error prevents us from reading its state. This mirrors the more detailed checks performed on each object file during a push operation by the partitionTransfers() method of the TransferQueue structure in the "tq" package.

One consequence of this change is that when an object to be uploaded is missing locally and is also not already present on the remote server, and when the "lfs.allowIncompletePush" Git configuration option is set to its default value of "false", we now always abandon a push operation after the remote server indicates that it expects the client to upload the object.

This means that in the "push reject missing object (lfs.allowincompletepush default)*" tests in our t/t-push-failures-local.sh test script, we should now expect to find a trace log message output by the client stating that the push operation's batch queue has been stopped because an object is missing on both the local system and the remote server.

We added this trace log message in a prior commit in this PR, and were able to insert checks for it in several other tests in our test suite, but only because those tests either did not create a file in the working tree at all, as in the case of the "pre-push reject missing object" test in our t/t-pre-push.sh script, or removed the file in the working tree that corresponded to the object file they removed from the .git/lfs/objects storage directories.

As a result of the changes in this commit, we can also now simplify the three tests that performed this extra setup step, where they used to remove the file in the working tree which corresponded to the object file they removed from the local Git LFS storage directories. This step is no longer necessary to cause the client to abandon the push operation after the server indicates that it requires an upload of an object the client has determined is missing from the local system.

Therefore we can remove these extra setup steps from both of the "push reject missing object (lfs.allowincompletepush false)*" tests in the t/t-push-failures-local.sh script, and from the "pre-push reject missing object (lfs.allowincompletepush default)" test in the t/t-pre-push.sh script.

1 parent 6479308 commit 51047a8

3 files changed: 11 additions and 48 deletions

commands/uploader.go

Lines changed: 6 additions & 36 deletions
@@ -4,7 +4,6 @@ import (
 	"io"
 	"net/url"
 	"os"
-	"path/filepath"
 	"strings"
 	"sync"
 
@@ -235,7 +234,7 @@ func (c *uploadContext) UploadPointers(q *tq.TransferQueue, unfiltered ...*lfs.W
 	pointers := c.prepareUpload(unfiltered...)
 	for _, p := range pointers {
 		t, err := c.uploadTransfer(p)
-		if err != nil && !errors.IsCleanPointerError(err) {
+		if err != nil {
 			ExitWithError(err)
 		}
 
@@ -345,9 +344,11 @@ func (c *uploadContext) uploadTransfer(p *lfs.WrappedPointer) (*tq.Transfer, err
 
 	// Skip the object if its corresponding file does not exist in
 	// .git/lfs/objects/.
-	if len(filename) > 0 {
-		if missing, err = c.ensureFile(filename, localMediaPath, oid); err != nil && !errors.IsCleanPointerError(err) {
-			return nil, err
+	if _, err := os.Stat(localMediaPath); err != nil {
+		if os.IsNotExist(err) {
+			missing = !c.allowMissing
+		} else {
+			return nil, errors.Wrap(err, tr.Tr.Get("Error uploading file %s (%s)", filename, oid))
 		}
 	}
 
@@ -360,37 +361,6 @@ func (c *uploadContext) uploadTransfer(p *lfs.WrappedPointer) (*tq.Transfer, err
 	}, nil
 }
 
-// ensureFile makes sure that the cleanPath exists before pushing it. If it
-// does not exist, it attempts to clean it by reading the file at smudgePath.
-func (c *uploadContext) ensureFile(smudgePath, cleanPath, oid string) (bool, error) {
-	if _, err := os.Stat(cleanPath); err == nil {
-		return false, nil
-	}
-
-	localPath := filepath.Join(cfg.LocalWorkingDir(), smudgePath)
-	file, err := os.Open(localPath)
-	if err != nil {
-		return !c.allowMissing, nil
-	}
-
-	defer file.Close()
-
-	stat, err := file.Stat()
-	if err != nil {
-		return false, err
-	}
-
-	cleaned, err := c.gitfilter.Clean(file, file.Name(), stat.Size(), nil)
-	if cleaned != nil {
-		cleaned.Teardown()
-	}
-
-	if err != nil {
-		return false, err
-	}
-	return false, nil
-}
-
 // supportsLockingAPI returns whether or not a given url is known to support
 // the LFS locking API by whether or not its hostname is included in the list
 // above.

t/t-pre-push.sh

Lines changed: 0 additions & 3 deletions
@@ -434,9 +434,6 @@ begin_test "pre-push reject missing object (lfs.allowincompletepush default)"
   git add present.dat missing.dat
   git commit -m "add objects"
 
-  git rm missing.dat
-  git commit -m "remove missing"
-
   delete_local_object "$missing_oid"
 
   echo "refs/heads/main main refs/heads/main 0000000000000000000000000000000000000000" |

t/t-push-failures-local.sh

Lines changed: 5 additions & 9 deletions
@@ -114,9 +114,6 @@ begin_test "push reject missing object (lfs.allowincompletepush false)"
   git add present.dat missing.dat
   git commit -m "add objects"
 
-  git rm missing.dat
-  git commit -m "remove missing"
-
   delete_local_object "$missing_oid"
 
   git config lfs.allowincompletepush false
@@ -164,9 +161,6 @@ begin_test "push reject missing object (lfs.allowincompletepush false) (git-lfs-
   git add present.dat missing.dat
   git commit -m "add objects"
 
-  git rm missing.dat
-  git commit -m "remove missing"
-
   delete_local_object "$missing_oid"
 
   git config lfs.allowincompletepush false
@@ -213,16 +207,17 @@ begin_test "push reject missing object (lfs.allowincompletepush default)"
 
   delete_local_object "$missing_oid"
 
-  git push origin main 2>&1 | tee push.log
+  GIT_TRACE=1 git push origin main 2>&1 | tee push.log
   if [ "1" -ne "${PIPESTATUS[0]}" ]; then
     echo >&2 "fatal: expected 'git push origin main' to fail ..."
     exit 1
   fi
 
+  grep "tq: stopping batched queue, object \"$missing_oid\" missing locally and on remote" push.log
   grep "LFS upload failed:" push.log
   grep " (missing) missing.dat ($missing_oid)" push.log
 
-  assert_server_object "$reponame" "$present_oid"
+  refute_server_object "$reponame" "$present_oid"
   refute_server_object "$reponame" "$missing_oid"
 )
 end_test
@@ -265,10 +260,11 @@ begin_test "push reject missing object (lfs.allowincompletepush default) (git-lf
 
   grep "pure SSH connection successful" push.log
 
+  grep "tq: stopping batched queue, object \"$missing_oid\" missing locally and on remote" push.log
   grep "LFS upload failed:" push.log
   grep " (missing) missing.dat ($missing_oid)" push.log
 
-  assert_remote_object "$reponame" "$present_oid" "${#present}"
+  refute_remote_object "$reponame" "$present_oid"
   refute_remote_object "$reponame" "$missing_oid"
 )
 end_test
