Writing Your Own Crypto Engine for USB Drives: Secure Storage, Stream Encryption, and Fault Tolerance in Python
It all started with a simple task: securely transferring files on regular USB drives without cumbersome containers or complex user setups.
Introduction: Why not VeraCrypt?
I needed a solution that was "insert the flash drive -> enter the password -> files are encrypted." The main requirement, though, was that the data survive a power failure: if the flash drive is pulled out in the middle of encryption, the data should not turn into garbage.
Thus, crypto_engine was born. This is not an attempt to invent my own cryptography (we use standard AES-GCM and ChaCha20), but an engineering effort to securely manage keys in memory, handle gigabyte files without overflowing RAM, and ensure data integrity.
1. Memory Problem in Python and the SecureBytes Class
The biggest weakness of cryptographic utilities written in managed languages (Python, Java) is memory management. When you store a password or key in a bytes object, you cannot overwrite it: bytes is immutable, routine operations such as slicing or concatenation create further copies, and freed memory is not zeroed. (A moving collector, as in the JVM, may additionally relocate objects during collection; CPython's collector does not move them.) As a result, copies of the secret can linger in RAM for an indefinite period of time.
In my engine, I implemented the SecureBytes class, which mitigates this problem:
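import gc
import secrets
import weakref
from typing import Union

class SecureBytes:
    def __init__(self, data: Union[bytes, bytearray, int]):
        if isinstance(data, int):
            # An int allocates a zero-filled buffer of that length
            self._buffer = bytearray(data)
        else:
            # bytes/bytearray input is copied into a mutable buffer
            self._buffer = bytearray(data)
        self._finalized = False
        # Register a weak finalizer (_cleanup_callback is not shown in this excerpt)
        self._weak_ref = weakref.ref(self, self._cleanup_callback)

    def wipe(self, passes: int = 3):
        # Note: this excerpt performs two passes and does not use `passes`
        if self._finalized or len(self._buffer) == 0:
            return
        # Pass 1: random data
        self._buffer[:] = secrets.token_bytes(len(self._buffer))
        # Pass 2: zeros
        self._buffer[:] = b'\x00' * len(self._buffer)
        self._finalized = True
        gc.collect()

    def __del__(self):
        # Last-resort wipe if the caller never called wipe() explicitly
        if not self._finalized:
            self.wipe()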
What is important here:
Usage of bytearray: Unlike immutable bytes, bytearray allows overwriting data at the same memory address.
Multi-pass wiping: Before releasing memory, the buffer is overwritten with random data and then with zeros. (NIST SP 800-88 addresses sanitization of storage media rather than process memory and treats a single overwrite pass as generally sufficient, so the extra pass is a precaution rather than a requirement.)
Context manager: Keys are used only inside the with secure_key(...): block, ensuring cleanup even in case of exceptions.
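The context-manager pattern described above can be sketched roughly as follows. This is a minimal illustration, not the engine's actual secure_key: the signature (a single bytes argument) and the single zeroing pass are assumptions, and a plain bytearray stands in for SecureBytes so that the example is self-contained.

```python
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def secure_key(key_material: bytes) -> Iterator[bytearray]:
    """Yield a mutable copy of the key and zero it on exit (hypothetical signature)."""
    buffer = bytearray(key_material)  # mutable, so it can be overwritten in place
    try:
        yield buffer
    finally:
        # Runs on normal exit and also when the with-block raises
        buffer[:] = bytes(len(buffer))


if __name__ == "__main__":
    with secure_key(b"\x01" * 32) as key:
        print(len(key))  # the key is usable only inside the block
    print(any(key))      # after the block, the buffer holds only zeros
```

Note that the caller's original bytes object is immutable and is not wiped by this sketch; only the mutable copy is.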
2. Streaming encryption of large files
Encrypting a 10 GB file from a USB drive on a machine with 4 GB of RAM is a non-trivial task: the whole file cannot be loaded into memory.
I implemented MemorySensitiveReader, which automatically switches between modes depending on file size and available memory:
import os

class MemorySensitiveReader:
    def __init__(self, file_path: str, memory_threshold: int = 100 * 1024 * 1024):
        self.file_size = os.path.getsize(file_path)
        # Threshold for switching to streaming mode
        self.use_streaming = self.file_size > memory_threshold

    def iter_chunks(self, chunk_size: int = 8192):
        # Reading and encrypting in chunks
        ...

Nonce problem in streaming encryption:
AES-GCM and ChaCha20-Poly1305 must never use the same nonce (a "number used once") for two different blocks under the same key. Reuse is a critical vulnerability: it reveals the XOR of the two plaintexts and can let an attacker forge authentication tags. The solution in my code is to derive a unique nonce for each block from the base nonce and the block index:
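def _derive_block_nonce_12bit(base_nonce: bytes, block_index: int) -> bytes:
    # First 8 bytes: prefix; last 4 bytes: big-endian block counter (12 bytes total)
    prefix = base_nonce[:8]
    block_counter = block_index.to_bytes(4, byteorder='big')
    return prefix + block_counter

This allows encrypting large files without reusing a nonce under the same key. The 4-byte counter caps a file at 2^32 blocks per base nonce; past that, to_bytes raises OverflowError rather than silently wrapping around.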
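To show how chunked reading and per-block nonces fit together, here is a small standard-library sketch. The names derive_block_nonce and iter_chunks_with_nonces are hypothetical and are not taken from the engine; each (nonce, chunk) pair would then be handed to an AEAD cipher, which is not shown.

```python
import os
from typing import Iterator, Tuple


def derive_block_nonce(base_nonce: bytes, block_index: int) -> bytes:
    """8-byte prefix from the base nonce plus a 4-byte big-endian block counter."""
    if len(base_nonce) < 8:
        raise ValueError("base nonce must be at least 8 bytes")
    # to_bytes raises OverflowError once block_index no longer fits in 4 bytes
    return base_nonce[:8] + block_index.to_bytes(4, byteorder="big")


def iter_chunks_with_nonces(
    path: str, base_nonce: bytes, chunk_size: int = 8192
) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (nonce, chunk) pairs while holding one chunk in memory at a time."""
    with open(path, "rb") as fh:
        block_index = 0
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:  # empty read means end of file
                break
            yield derive_block_nonce(base_nonce, block_index), chunk
            block_index += 1
```

A caller would generate the base nonce once per file, for example with os.urandom(12), and store it alongside the ciphertext so the same nonces can be re-derived during decryption.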
3. Fault tolerance: what if you pull out the USB drive?
The worst-case scenario for the user is data loss due to a failure during encryption. The standard approach “encrypt -> delete the original” does not work here.
I implemented a locking and rollback system:
Lock file (.encryption_lock.json): Before starting the operation, a file is created that records the status in_progress and the list of already processed files.
Temporary files: Encryption is written to a .tmp file. Only after a successful integrity check is the original deleted and the temporary file renamed.
Integrity check: Before deleting the original, I decrypt the data back and compare the HMAC and SHA-256 hashes. If they do not match, the original is not touched.
Recovery: If the process is interrupted, the utility sees the lock file and suggests rolling back the operation (rollback_operation), decrypting the already processed files back.
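The lock-file and temporary-file steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the engine's code: process_file_safely, the .enc suffix, and the encrypt/decrypt callables are hypothetical, the whole file is read at once to keep the sketch short, and only a SHA-256 comparison is shown.

```python
import hashlib
import json
import os
from pathlib import Path
from typing import Callable

LOCK_NAME = ".encryption_lock.json"  # file name taken from the article


def _write_lock(lock_path: Path, state: dict) -> None:
    """Write the lock state to a side file, flush it to disk, then swap it in."""
    side_path = lock_path.with_name(lock_path.name + ".new")
    with open(side_path, "w", encoding="utf-8") as fh:
        json.dump(state, fh)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(side_path, lock_path)


def process_file_safely(
    src: Path,
    encrypt: Callable[[bytes], bytes],
    decrypt: Callable[[bytes], bytes],
) -> Path:
    """Encrypt src via a .tmp file; touch the original only after verification."""
    lock_path = src.parent / LOCK_NAME
    state = {"status": "in_progress", "processed": []}
    _write_lock(lock_path, state)

    plaintext = src.read_bytes()
    tmp_path = src.with_name(src.name + ".tmp")
    with open(tmp_path, "wb") as fh:
        fh.write(encrypt(plaintext))
        fh.flush()
        os.fsync(fh.fileno())  # push the data to the device before verifying

    # Integrity check: decrypt what was written and compare digests
    restored = decrypt(tmp_path.read_bytes())
    if hashlib.sha256(restored).digest() != hashlib.sha256(plaintext).digest():
        tmp_path.unlink()
        raise ValueError("integrity check failed; original left untouched")

    final_path = src.with_name(src.name + ".enc")  # hypothetical suffix
    os.replace(tmp_path, final_path)
    src.unlink()

    state["processed"].append(src.name)
    state["status"] = "done"
    _write_lock(lock_path, state)
    return final_path
```

In this sketch the temporary file is renamed before the original is removed, so a verified copy exists under its final name at every point after the check. How atomic os.replace is in practice depends on the filesystem of the flash drive, so treat that as an assumption to test on the target media.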
4. Algorithms and performance
The engine supports three algorithms:
AES-256-GCM: Industry standard, hardware acceleration on most CPUs.
ChaCha20-Poly1305: Faster on devices without AES-NI (e.g., some ARM processors).
XChaCha20-Poly1305: Extended nonce (24 bytes), which reduces the risk of nonce collisions when encrypting very large volumes of data.
To speed up processing of many small files, parallel processing via ThreadPoolExecutor is implemented. However, due to the GIL in Python, the performance gain is more noticeable in I/O operations than in pure encryption.
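A rough sketch of the thread-pool approach for many small files is shown below. The function name process_files_parallel and the worker callable are hypothetical; the worker stands in for whatever per-file encryption routine the engine uses.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, Dict, Iterable, Tuple


def process_files_parallel(
    paths: Iterable[str],
    worker: Callable[[str], Any],
    max_workers: int = 4,
) -> Tuple[Dict[str, Any], Dict[str, Exception]]:
    """Run worker(path) for each path in a thread pool; collect results and errors."""
    results: Dict[str, Any] = {}
    errors: Dict[str, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # one failed file should not stop the rest
                errors[path] = exc
    return results, errors
```

Collecting errors per file instead of raising on the first one fits the fault-tolerance goal: the caller can decide which files to retry or roll back.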
5. Interface and usage
Although the core is written in Python, a GUI is available for users, so they do not need to run scripts through the console.
6. Limitations and Threat Model
It is important to understand what this tool is suitable for and what it is not.
Metadata is not hidden: File names and folder structure are saved in .usb_crypt_meta.json. An attacker with access to the flash drive will see the list of files but will not be able to open them. Hiding file names without creating a container is technically difficult and inconvenient for navigation.
Protection from physical loss: The tool protects the data if you lose the flash drive. It does not protect against keyloggers on the computer where you enter the password.
Password policy: Built-in validation requires at least 12 characters, including digits and special characters. Weak passwords are blocked at the code level.
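The password rule described above might look roughly like this. The function name password_problems is hypothetical, and treating "special characters" as string.punctuation is an assumption; the engine's actual validation may differ.

```python
import string
from typing import List

MIN_LENGTH = 12  # minimum length stated in the article


def password_problems(password: str) -> List[str]:
    """Return a list of policy violations; an empty list means the password passes."""
    problems: List[str] = []
    if len(password) < MIN_LENGTH:
        problems.append(f"must be at least {MIN_LENGTH} characters long")
    if not any(ch in string.digits for ch in password):
        problems.append("must contain at least one digit")
    # Assumption: "special characters" means ASCII punctuation
    if not any(ch in string.punctuation for ch in password):
        problems.append("must contain at least one special character")
    return problems
```

Returning every violation at once, rather than stopping at the first, lets a GUI show the user everything that needs fixing in one message.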
Conclusion
Writing your own crypto-engine is always a balance between security and convenience. I focused on memory management security (which is rarely encountered in Python scripts) and fault tolerance of operations.
The project is open, and the code is available for auditing. If you find vulnerabilities or ways to optimize SecureBytes — feel free to open an issue.