PickleScan - Bypass via bad CRC in archive

High
mmaitre314 published GHSA-mjqp-26hc-grxg Sep 8, 2025

Package

picklescan (pip)

Affected versions

<= 0.0.30

Patched versions

0.0.31

Description

Summary

PickleScan cannot scan a ZIP archive for malicious pickle files when the archive contains a file with a bad Cyclic Redundancy Check (CRC). Instead of scanning the files in the archive regardless of their CRC, PickleScan fails with an error and returns no results. Attackers can use this to hide malicious pickle payloads in ZIP archives that PyTorch might still be able to load (as PyTorch often disables CRC checks).

Details

PickleScan likely uses Python's built-in zipfile module to handle ZIP archives. When zipfile reads a file whose declared CRC does not match the CRC calculated from its data, it can raise an exception (e.g., BadZipFile or a related error). PickleScan does not appear to fall back to scanning the files regardless of their CRC.
This behavior contrasts with PyTorch's model loading, which in many cases might skip CRC checks for ZIP archives, whatever its configuration. The discrepancy creates a blind spot: a malicious model packaged in a ZIP with a bad CRC could be loaded by PyTorch while being completely missed by PickleScan.
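The zipfile behavior described above can be reproduced with a small sketch. It assumes nothing about PickleScan's internals: it builds a one-member archive in memory, alters the stored bytes so the recorded CRC-32 no longer matches, and then reads the member with the standard library. The member name, payload constants, and helper names are hypothetical.

```python
import io
import zipfile

PAYLOAD = b"PAYLOAD-0123456789"
CORRUPTED = b"PAYLOAD-X123456789"  # same length, different bytes


def make_bad_crc_archive() -> bytes:
    """Build a one-member ZIP, then alter the data so the recorded CRC is wrong."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr("archive/data.pkl", PAYLOAD)
    # ZIP_STORED keeps the member bytes verbatim, so a byte-level replace
    # changes the data while the recorded CRC-32 stays untouched.
    return buf.getvalue().replace(PAYLOAD, CORRUPTED, 1)


def try_read(raw: bytes, name: str) -> str:
    """Read one member and report whether zipfile accepted or rejected it."""
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        try:
            zf.read(name)
            return "ok"
        except zipfile.BadZipFile as exc:
            return f"BadZipFile: {exc}"


result = try_read(make_bad_crc_archive(), "archive/data.pkl")
print(result)
```

A scanner that lets this exception propagate stops before it has looked at any pickle opcodes, which matches the failure mode the advisory describes.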

PoC

  1. Download an existing PyTorch model with a bad CRC:

wget "https://huggingface.co/jinaai/jina-embeddings-v2-base-en/resolve/main/pytorch_model.bin?download=true" -O pytorch_model.bin

  2. Attempt to scan the corrupted ZIP file with PickleScan:

# Assuming you have PickleScan installed and in your PATH
picklescan -p pytorch_model.bin

Observed Result: PickleScan returns no results and presents an error message indicating a problem with the ZIP file, but it doesn’t attempt to scan any potentially valid pickle files within the archive.

Expected Result: PickleScan should either:

  • Attempt to extract and scan other valid files within the ZIP archive, even if some have CRC errors.
  • Report a warning indicating that the ZIP archive has CRC errors and might be incomplete or corrupted, but still attempt to scan any accessible content.
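One way to picture the expected behavior is a per-member loop that isolates failures instead of aborting on the first one. The sketch below is not PickleScan's code; the function and member names are hypothetical, and "scanning" is reduced to reading the bytes. Members that fail the CRC check are returned separately so a caller can flag them as unscanned rather than treat them as clean.

```python
import io
import zipfile

GOOD = b"GOOD-MEMBER-BYTES"
BAD = b"BAD-MEMBER-BYTES-0"
BAD_ALTERED = b"BAD-MEMBER-BYTES-1"  # same length as BAD


def make_mixed_archive() -> bytes:
    """Two-member ZIP in which only the second member has a CRC mismatch."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr("model/ok.pkl", GOOD)
        zf.writestr("model/bad.pkl", BAD)
    # Alter the stored bytes of the second member; its recorded CRC stays as is.
    return buf.getvalue().replace(BAD, BAD_ALTERED, 1)


def read_members(raw: bytes):
    """Return (readable contents, list of (name, error) for members that failed)."""
    contents, unscanned = {}, []
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        for info in zf.infolist():
            try:
                contents[info.filename] = zf.read(info)
            except zipfile.BadZipFile as exc:
                # Keep going: one bad member should not hide the others.
                unscanned.append((info.filename, str(exc)))
    return contents, unscanned


contents, unscanned = read_members(make_mixed_archive())
print(sorted(contents))
print([name for name, _ in unscanned])
```

Skipping the failing member is only half of the expected result, since that member is exactly where a payload could hide; reading it with CRC validation relaxed is a separate step.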

Impact

Severity: High
Affected Users: Any organization or individual using PickleScan to analyze PyTorch models or other files distributed as ZIP archives for malicious pickle content.
Impact Details: Attackers can craft malicious PyTorch models containing embedded pickle payloads, package them into ZIP archives, and intentionally introduce CRC errors. This would cause PickleScan to fail to analyze the archive, while PyTorch is still able to load the model (depending on its configuration regarding CRC checks). This creates a significant vulnerability where malicious code can be distributed and potentially executed without detection by PickleScan.
Example: PickleScan on Hugging Face fails with an error on this repository (https://huggingface.co/jinaai/jina-embeddings-v2-base-en/tree/main)

Recommendations:
PickleScan should not fail on a bad CRC, especially when PyTorch does not check CRCs.
PickleScan's RelaxedZipFile class is a good place to fix this issue:

--- picklescan/src/picklescan/relaxed_zipfile.py
+++ picklescan/src/picklescan/relaxed_zipfile.py
@@ class RelaxedZipFile(zipfile.ZipFile):
         try:
             # Skip the file header:
             fheader = zef_file.read(sizeFileHeader)
             if len(fheader) != sizeFileHeader:
                 raise zipfile.BadZipFile("Truncated file header")

             fheader = struct.unpack(structFileHeader, fheader)
             if fheader[_FH_SIGNATURE] != stringFileHeader:
                 raise zipfile.BadZipFile("Bad magic number for file header")

             zef_file.read(fheader[_FH_FILENAME_LENGTH])
             if fheader[_FH_EXTRA_FIELD_LENGTH]:
                 zef_file.read(fheader[_FH_EXTRA_FIELD_LENGTH])

-            return zipfile.ZipExtFile(zef_file, mode, zinfo, pwd, True)
+
+            # Create the ZipExtFile and disable CRC check
+            ext_file = zipfile.ZipExtFile(zef_file, mode, zinfo, pwd)
+            # Monkey-patch to skip CRC validation
+            ext_file._expected_crc = None
+            return ext_file

         except BaseException:
             zef_file.close()
             raise
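The effect of clearing `_expected_crc` can be tried outside PickleScan with the standard library alone. This is a minimal sketch, not the project's patch: `read_ignoring_crc` and the member name are hypothetical, and `_expected_crc` is a private attribute of CPython's `ZipExtFile`, so the approach is an assumption about CPython internals that may not hold on other implementations or future versions.

```python
import io
import zipfile

ORIGINAL = b"TENSOR-BYTES-0000"
ALTERED = b"TENSOR-BYTES-1111"  # same length as ORIGINAL


def make_bad_crc_archive() -> bytes:
    """One-member ZIP whose stored bytes no longer match the recorded CRC-32."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr("model/data.pkl", ORIGINAL)
    return buf.getvalue().replace(ORIGINAL, ALTERED, 1)


def read_ignoring_crc(zf: zipfile.ZipFile, name: str) -> bytes:
    """Read a member without CRC validation (hypothetical helper)."""
    with zf.open(name) as member:
        # Assumption: private CPython attribute; None makes ZipExtFile
        # skip its CRC comparison while reading.
        member._expected_crc = None
        return member.read()


with zipfile.ZipFile(io.BytesIO(make_bad_crc_archive())) as zf:
    data = read_ignoring_crc(zf, "model/data.pkl")

print(data == ALTERED)
```

The bytes come back as stored in the archive, so a scanner can inspect what a CRC-tolerant loader would actually see.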

Severity

High

CVSS overall score

This score calculates overall vulnerability severity from 0 to 10 and is based on the Common Vulnerability Scoring System (CVSS).
7.5 / 10

CVSS v3 base metrics

Attack vector: Network
Attack complexity: Low
Privileges required: None
User interaction: None
Scope: Unchanged
Confidentiality: High
Integrity: None
Availability: None

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
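
As a worked check of the vector above, the following sketch applies the CVSS v3.1 base-score equations for the Scope: Unchanged case, using the metric weights from the CVSS v3.1 specification. The function names are illustrative, and only the metric values that appear in this vector are included.

```python
import math

# CVSS v3.1 weights for the values used in this advisory's vector.
AV_NETWORK = 0.85
AC_LOW = 0.77
PR_NONE = 0.85
UI_NONE = 0.85
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a) -> float:
    """Base score for Scope: Unchanged only."""
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    impact = 6.42 * iss
    exploitability = 8.22 * av * ac * pr * ui
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
score = base_score_scope_unchanged(
    AV_NETWORK, AC_LOW, PR_NONE, UI_NONE, CIA["H"], CIA["N"], CIA["N"]
)
print(score)  # 7.5
```

A score of 7.5 falls in the 7.0 to 8.9 band, which CVSS v3.1 rates as High.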

CVE ID

No known CVE

Weaknesses

Protection Mechanism Failure

The product does not use or incorrectly uses a protection mechanism that provides sufficient defense against directed attacks against the product.