Digest hashing in Ruby: MD5, SHA256, and more
Digest hashing turns any chunk of data (a password, a file, or a JSON payload) into a fixed-length fingerprint. The same input always produces the same output, but there is no practical way to reverse it. Ruby’s standard library ships with this capability through the Digest module, which wraps a range of hash functions underneath.
You do not need any gems for this. require 'digest' is enough to start computing digests in plain Ruby.
Key takeaways
- Use hexdigest when you want a readable hash string.
- Use digest when you need raw bytes for binary protocols or lower-level integrations.
- Use update or << when data arrives in chunks.
- Use file when you need to hash a file without loading it all into memory.
- Avoid MD5 and SHA1 for anything security-sensitive.
Those are the main choices most Ruby developers need. Once you understand them, the rest of the API is mostly about picking the right algorithm and feeding it data in the format your application already has.
Digest hashing in practice
Digest hashing shows up in a few common places. You might use it to compare a downloaded file against a known checksum, to generate a stable cache key from a payload, or to store a fingerprint of data that should not change between runs. Those are all good uses because they rely on the one thing a digest does well: the same input produces the same output.
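As a rough illustration of the cache-key use case, the sketch below derives a key from a small payload. The helper name cache_key_for, the "payload:" prefix, and the choice to sort keys before serializing are assumptions made for this example, not part of the Digest API.

```ruby
require 'digest'
require 'json'

# Hypothetical helper: derives a cache key from a flat Hash payload.
# Keys are sorted first so that insertion order does not change the digest.
def cache_key_for(payload)
  canonical = JSON.generate(payload.sort_by { |key, _| key.to_s }.to_h)
  "payload:#{Digest::SHA256.hexdigest(canonical)}"
end

key_a = cache_key_for({ id: 42, name: "widget" })
key_b = cache_key_for({ name: "widget", id: 42 })
puts key_a == key_b # both payloads map to one cache key
```

Sorting only handles the top level of the hash; a nested payload would need a deeper canonical form before hashing.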
It is important to separate digest hashing from password storage. A digest is fast, which is what makes it useful for checksums and fingerprints. Password storage needs the opposite tradeoff: a slow algorithm that makes brute-force attacks expensive. If you keep those roles separate, the rest of the API becomes much easier to reason about.
One-Shot Hashing
The quickest way to hash something is with the class method:
require 'digest'
Digest::SHA256.hexdigest("hello")
# => "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"
The hex output is the most common format for display, logging, and comparison because it is readable and compact. Each byte becomes two hex characters, so a SHA256 digest produces exactly 64 characters of output every time.
hexdigest returns a human-readable hex string. If you need raw binary bytes for another system, use digest instead:
Digest::SHA256.digest("hello")
# => ",\xF2M\xBA_\xB0\xA3\x0E&\xE8;*\xC5\xB9..." (32 raw bytes)
For base64 encoding, which can be useful in HTTP headers or JSON payloads:
Digest::SHA256.base64digest("hello")
# => "LPJNul+wow4m6DsqxbninhsWHlwfp0JecwQzYpOLmCQ="
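The three formats are different encodings of the same 32 bytes. A small sketch that checks this, using String#unpack1 and Array#pack from core Ruby to convert between them:

```ruby
require 'digest'

raw = Digest::SHA256.digest("hello")
hex = Digest::SHA256.hexdigest("hello")
b64 = Digest::SHA256.base64digest("hello")

puts raw.bytesize              # 32 raw bytes
puts raw.unpack1("H*") == hex  # hex is the raw bytes, two characters per byte
puts [raw].pack("m0") == b64   # base64 is the raw bytes, encoded without line breaks
```

Which one to store is mostly a question of what the receiving system expects; converting between them later never requires rehashing the input.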
Available Algorithms
The module provides one class per algorithm:
Digest::MD5.hexdigest("hello") # => "5d41402abc4b2a76b9719d911017c592" (insecure, legacy checksums only)
Digest::SHA1.hexdigest("hello") # => "aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d" (insecure, legacy checksums only)
Digest::SHA256.hexdigest("hello") # same value as the SHA256 example above
Digest::SHA384.hexdigest("hello")
Digest::SHA512.hexdigest("hello")
Digest::RMD160.hexdigest("hello") # RIPEMD-160
The algorithm you pick should match your needs: SHA256 for general checksums and fingerprints, SHA512 when you need a longer digest output, and MD5 or SHA1 only when you are verifying legacy files that were published with those algorithms. Each algorithm is available as a subclass of Digest::Base, so they all share the same hexdigest, digest, and base64digest interface described above.
MD5 and SHA1 are cryptographically broken and should never be used for security-sensitive work. SHA256 is the minimum recommended for new code, and SHA512 is a fine choice when you want a longer digest string.
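To make the difference in output size concrete, this sketch prints the digest length of several of the classes listed above. RIPEMD-160 is left out only to keep the example short.

```ruby
require 'digest'

[Digest::MD5, Digest::SHA1, Digest::SHA256, Digest::SHA384, Digest::SHA512].each do |klass|
  digest = klass.new
  hex_length = klass.hexdigest("hello").length
  puts "#{klass.name}: #{digest.digest_length} bytes, #{hex_length} hex characters"
end
```

The hex length is always twice the byte length, which is worth knowing when sizing a database column for stored digests.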
Incremental hashing with update
When your data arrives in chunks, such as from a streaming file read or a large payload, build the hash incrementally:
sha = Digest::SHA256.new
sha.update("hello")
sha.update(" ")
sha.update("world")
sha.hexdigest
# => "b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9"
Each call to update feeds more data into the hashing state without resetting it. That means the final digest is the same whether you feed all the data at once or spread it across many calls, as long as the order is preserved. The << operator is an alias for update, so you can chain more elegantly:
sha = Digest::SHA256.new
sha << "hello" << " " << "world"
sha.hexdigest
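Chaining with << reads well when you are assembling a digest from several known pieces, such as a header, a separator, and a message body. The chain form is purely syntactic sugar, but it often makes the code read more like a pipeline. If one of the pieces is a secret key, use an HMAC (for example OpenSSL::HMAC) rather than hashing the key and message together yourself.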
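A quick way to convince yourself that chunking does not change the result is to compare a streamed digest with a one-shot digest of the same bytes. The chunks array below is a stand-in for data arriving piece by piece.

```ruby
require 'digest'

chunks = ["hello", " ", "world"] # stand-in for streamed data

streamed = Digest::SHA256.new
chunks.each { |chunk| streamed.update(chunk) }

one_shot = Digest::SHA256.hexdigest(chunks.join)
puts streamed.hexdigest == one_shot # same bytes in the same order give the same digest

reordered = Digest::SHA256.new
chunks.reverse.each { |chunk| reordered << chunk }
puts reordered.hexdigest == one_shot # feeding the chunks in another order changes the digest
```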
Once you are done, call reset to clear the state for a fresh round:
sha.reset
sha << "new data"
sha.hexdigest
Calling reset is important because the digest object is stateful. If you forget to reset before starting a new hash, the second digest will include data from the first one, which produces a completely different output. That bug can be hard to spot because both results look like valid hex strings.
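The carry-over is easy to see by comparing the results against one-shot digests. The message strings here are placeholders chosen for the sketch.

```ruby
require 'digest'

sha = Digest::SHA256.new
sha << "first message"
first = sha.hexdigest

sha << "second message" # no reset: the state now covers both messages
carried_over = sha.hexdigest

sha.reset
sha << "second message"
clean = sha.hexdigest

puts carried_over == Digest::SHA256.hexdigest("first messagesecond message")
puts clean == Digest::SHA256.hexdigest("second message")
```

When the inputs are unrelated, creating a fresh digest object per input is often simpler than remembering to reset a shared one.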
Hashing files
Hashing a file without loading it entirely into memory uses file:
sha = Digest::SHA256.file("/path/to/large.zip")
sha.hexdigest
Or more concisely as a one-liner:
Digest::SHA256.file("/path/to/large.zip").hexdigest
file reads in chunks, so it works on arbitrarily large files without loading the whole file into memory at once. That makes it a good fit for checksums, download verification, and other tasks where the file itself is the input.
The same pattern is helpful anywhere you want to keep memory use predictable. You do not need to build your own loop over the file unless you want finer control over the read buffer or need to combine the hash with other processing.
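If you do want that finer control, a manual loop might look like the sketch below. The helper name sha256_of_file and the 64 KiB default buffer are assumptions for illustration; the temporary file only exists to make the example self-contained.

```ruby
require 'digest'
require 'tempfile'

# Hypothetical helper: hashes a file with an explicit read loop.
# The 64 KiB default buffer size is an arbitrary choice for illustration.
def sha256_of_file(path, buffer_size: 64 * 1024)
  sha = Digest::SHA256.new
  File.open(path, "rb") do |io|
    # IO#read returns nil at end of file, which ends the loop.
    while (chunk = io.read(buffer_size))
      sha << chunk
    end
  end
  sha.hexdigest
end

Tempfile.create("digest-demo") do |tmp|
  tmp.write("hello world" * 10_000)
  tmp.flush
  puts sha256_of_file(tmp.path) == Digest::SHA256.file(tmp.path).hexdigest
end
```

The loop body is the natural place to add other per-chunk work, such as counting bytes or writing the data elsewhere while hashing it.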
Checking Equality with ==
Compare a digest object directly against a string:
sha = Digest::SHA256.file("/etc/hosts")
sha == "a7b3c2d1e9f0..." # false if mismatched
This is handy for verifying a download against a published checksum:
def checksum_matches?(path, expected_hex)
  Digest::SHA256.file(path) == expected_hex
end
Do not use this pattern to verify passwords. For actual passwords, use a dedicated password hashing function like bcrypt or Argon2 instead of a plain hash. Those tools are designed to be slow, which makes brute-force attacks much more expensive.
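The == comparison on a digest object accepts either a hex string or another digest object. A short sketch of both forms, using the SHA256 value for "hello" shown earlier:

```ruby
require 'digest'

expected_hex = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"

sha = Digest::SHA256.new
sha << "hello"

puts sha == expected_hex                        # compared against the hex digest string
puts sha == Digest::SHA256.new.update("hello")  # compared against another digest object
puts sha == expected_hex.upcase                 # string comparison is case-sensitive
```

Because the string form is case-sensitive, it is worth normalizing a published checksum (for example with downcase and strip) before comparing.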
Digest hashing and password storage
Digest hashing is a poor fit for password storage because it is fast by design. That speed is useful for checksums and fingerprints, but it also means an attacker can test a huge number of candidate passwords in a short time. A plain digest is not enough protection on its own.
Password hashing libraries solve that by adding salts and by making the work intentionally expensive. When a user signs in, the application checks the submitted password against the stored verifier, not against a plain digest. That extra cost is what makes modern password storage safer.
If you are building anything that accepts logins, treat Digest as a teaching tool or a checksum helper, not as the final answer for credential storage. That one decision keeps the rest of your design much cleaner.
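To show what salting and deliberate cost look like in code, the sketch below uses PBKDF2 from Ruby's bundled openssl library. PBKDF2 is a stand-in chosen here because it needs no gems; it is not the bcrypt or Argon2 recommended above, and the iteration count, helper names, and sample passwords are assumptions for illustration only.

```ruby
require 'openssl'
require 'securerandom'

# Illustration only: PBKDF2 stands in for bcrypt or Argon2 here.
# The iteration count is an assumed example value, not a recommendation.
ITERATIONS = 200_000

def derive_verifier(password, salt, iterations: ITERATIONS)
  OpenSSL::KDF.pbkdf2_hmac(password, salt: salt, iterations: iterations,
                           length: 32, hash: "sha256")
end

def password_matches?(password, salt, stored_verifier, iterations: ITERATIONS)
  candidate = derive_verifier(password, salt, iterations: iterations)
  # Constant-time comparison of two equal-length strings.
  OpenSSL.fixed_length_secure_compare(candidate, stored_verifier)
end

salt = SecureRandom.random_bytes(16) # unique per user, stored next to the verifier
stored = derive_verifier("correct horse", salt)

puts password_matches?("correct horse", salt, stored)
puts password_matches?("wrong guess", salt, stored)
```

The salt makes identical passwords produce different verifiers, and the iteration count is what makes each guess expensive. A plain Digest call provides neither.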
Choosing an Algorithm
SHA256 is the default recommendation for most uses. It offers a good balance of speed and collision resistance for everyday tasks like checksums and content-addressable storage.
SHA512 is preferred when you want a longer hash, for example when a policy or protocol calls for more than SHA256’s 32-byte (256-bit) output. SHA384 is a truncated variant of SHA512 (it uses different initial values and keeps 384 bits of output), which gives you a middle ground without changing families.
MD5 and SHA1 remain useful only for legacy compatibility, for example when you need to validate the checksum of an old download that was published with an MD5 hash. Never use them for new security work.
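When the algorithm has to be chosen at runtime, for instance because a legacy checksum file names it, one option is a small lookup table. The table, its string keys, and the checksum helper below are hypothetical names for this sketch.

```ruby
require 'digest'

# Hypothetical lookup table; the string keys are assumptions for this sketch.
DIGEST_CLASSES = {
  "md5"    => Digest::MD5,    # legacy checksums only
  "sha1"   => Digest::SHA1,   # legacy checksums only
  "sha256" => Digest::SHA256,
  "sha384" => Digest::SHA384,
  "sha512" => Digest::SHA512
}.freeze

def checksum(data, algorithm: "sha256")
  klass = DIGEST_CLASSES.fetch(algorithm) do
    raise ArgumentError, "unsupported algorithm: #{algorithm}"
  end
  klass.hexdigest(data)
end

puts checksum("hello")
puts checksum("hello", algorithm: "md5")
```

Defaulting to SHA256 and failing loudly on unknown names keeps a typo from silently falling back to a weaker algorithm.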
Frequently asked questions
Should I always use SHA256?
SHA256 is the default choice for most Ruby projects. It is widely supported, fast enough for normal checksum work, and much safer than MD5 or SHA1.
When should I choose SHA512?
Use SHA512 when you want a longer digest or when your policy prefers the SHA512 family. It is still a general-purpose hash, just with a larger output size.
Is Digest enough for passwords?
No. Passwords need a slow, salted hashing scheme such as bcrypt or Argon2.
Common pitfalls
Reusing a digest object without resetting. After calling hexdigest, the digest state still holds the accumulated data. Call reset before starting a new hash:
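# Wrong: the second digest still includes "hello"
sha = Digest::SHA256.new
sha << "hello"
sha.hexdigest # digest of "hello"
sha << "world"
sha.hexdigest # digest of "helloworld", not "world"
# Right: reset before hashing unrelated data
sha.reset
sha << "world"
sha.hexdigest # digest of "world"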
Storing passwords as plain hashes. An unsalted SHA256 hash of a common password can be recovered quickly with a dictionary attack, because each guess costs almost nothing to compute. Use bcrypt or Argon2 with built-in salting instead.
Is digest the same as hexdigest?
No. digest returns raw binary data, while hexdigest returns a hex string that is easier to read, print, and compare in logs.
Should I hash passwords with Digest?
No. Use bcrypt, Argon2, or another password hashing algorithm that is designed for slow verification and salting.
Can I hash a file without reading it all at once?
Yes. Digest::SHA256.file(path) reads in chunks, which keeps memory use low even for large files.
See Also
- ruby-string-manipulation — string transformation techniques useful when preparing data for hashing
- ruby-blocks-procs-lambdas — chaining operations with blocks and procs
- ruby-io-and-files — handling exceptions when working with files and digests
- Ruby Digest documentation — official API reference for digest classes