Digest hashing in Ruby: MD5, SHA256, and more

March 12, 2026 · 7 min read · Updated May 29, 2026 · intermediate

digest · hashing · md5 · sha256 · cryptography · security

Digest hashing turns any chunk of data, whether a password, a file, or a JSON payload, into a fixed-length fingerprint. The same input always produces the same output, but there is no practical way to recover the input from the output. Ruby’s standard library provides this through the Digest module, which exposes several hash functions behind a common interface.

You do not need any gems for this. require 'digest' is enough to start computing digests in plain Ruby.

Key takeaways

Use hexdigest when you want a readable hash string.
Use digest when you need raw bytes for binary protocols or lower-level integrations.
Use update or << when data arrives in chunks.
Use file when you need to hash a file without loading it all into memory.
Avoid MD5 and SHA1 for anything security-sensitive.

Those are the main choices most Ruby developers need. Once you understand them, the rest of the API is mostly about picking the right algorithm and feeding it data in the format your application already has.

Digest hashing in practice

Digest hashing shows up in a few common places. You might use it to compare a downloaded file against a known checksum, to generate a stable cache key from a payload, or to store a fingerprint of data that should not change between runs. Those are all good uses because they rely on the one thing a digest does well: the same input produces the same output.
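As a minimal sketch of the cache-key case, the snippet below hashes a canonical JSON form of a payload. The helper name cache_key_for, the "payload:" prefix, and the decision to sort only top-level keys are assumptions made for illustration; nested hashes would need their own canonicalization.

```ruby
require 'digest'
require 'json'

# Hypothetical helper: derive a stable cache key from a flat Hash payload.
# Sorting the top-level keys makes the key independent of insertion order.
def cache_key_for(payload)
  canonical = JSON.generate(payload.sort_by { |key, _| key.to_s }.to_h)
  "payload:#{Digest::SHA256.hexdigest(canonical)}"
end

key_a = cache_key_for({ "id" => 1, "name" => "widget" })
key_b = cache_key_for({ "name" => "widget", "id" => 1 })

puts key_a == key_b   # prints true: same content, same key
```

The digest only guarantees "same bytes in, same key out", so the canonicalization step is what makes two logically equal payloads hash to the same value.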

It is important to separate digest hashing from password storage. A digest is fast, which is what makes it useful for checksums and fingerprints. Password storage needs the opposite tradeoff: a slow algorithm that makes brute-force attacks expensive. If you keep those roles separate, the rest of the API becomes much easier to reason about.

One-shot hashing

The quickest way to hash something is with the class method:

require 'digest'

Digest::SHA256.hexdigest("hello")
# => "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"

The hex output is the most common format for display, logging, and comparison because it is readable and compact. Each byte becomes two hex characters, so a SHA256 digest produces exactly 64 characters of output every time.

hexdigest returns a human-readable hex string. If you need raw binary bytes for another system, use digest instead:

Digest::SHA256.digest("hello")
# => ",\xF2M\xBA_\xB0\xA3\x0E&\xE8;*\xC5\xB9\xE2\x9E..."

For base64 encoding, which can be useful in HTTP headers or JSON payloads:

Digest::SHA256.base64digest("hello")
# => "LPJNul+wow4m6DsqxbninhsWHlwfp0JecwQzYpOLmCQ="
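The three methods return the same 32 bytes in different encodings. The sketch below re-encodes the raw bytes by hand with String#unpack1 and Array#pack to show that relationship; the variable names are illustrative only.

```ruby
require 'digest'

raw = Digest::SHA256.digest("hello")        # 32 raw bytes
hex = Digest::SHA256.hexdigest("hello")     # 64 hex characters
b64 = Digest::SHA256.base64digest("hello")  # 44 base64 characters

# Re-encode the raw bytes by hand and compare with the library's output.
hex_from_raw = raw.unpack1("H*")
b64_from_raw = [raw].pack("m0")

puts raw.bytesize          # 32
puts hex == hex_from_raw   # true
puts b64 == b64_from_raw   # true
```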

Available algorithms

The module provides one class per algorithm:

Digest::MD5.hexdigest("hello")      # "5d41402abc4b2a76b9719d911017c592" (insecure, legacy use only)
Digest::SHA1.hexdigest("hello")     # "aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d" (insecure, legacy use only)
Digest::SHA256.hexdigest("hello")   # the SHA256 example above
Digest::SHA384.hexdigest("hello")
Digest::SHA512.hexdigest("hello")
Digest::RMD160.hexdigest("hello")   # RIPEMD-160

The algorithm you pick should match your needs: SHA256 for general checksums and fingerprints, SHA512 when you need a longer digest output, and MD5 or SHA1 only when you are verifying legacy files that were published with those algorithms. Each algorithm is available as a subclass of Digest::Base, so they all share the same hexdigest, digest, and base64digest interface described above.

MD5 and SHA1 are cryptographically broken and should never be used for security-sensitive work. SHA256 is the minimum recommended for new code, and SHA512 is a fine choice when you want a longer digest string.
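To make the size differences concrete, this short sketch asks each class for its digest_length. The helper name digest_sizes is hypothetical, and the list of classes is only a sample of what the module offers.

```ruby
require 'digest'

# Hypothetical helper: map each algorithm's class name to its output size in bytes.
def digest_sizes(algorithms = [Digest::MD5, Digest::SHA1, Digest::SHA256,
                               Digest::SHA384, Digest::SHA512])
  algorithms.to_h { |klass| [klass.name, klass.new.digest_length] }
end

digest_sizes.each do |name, bytes|
  # Each byte becomes two hex characters in hexdigest output.
  puts format("%-16s %2d bytes, %3d hex characters", name, bytes, bytes * 2)
end
```

Knowing the expected length is also a cheap sanity check when you receive a checksum from elsewhere: a 32-character hex string cannot be a SHA256 digest.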

Incremental hashing with `update`

When your data arrives in chunks, such as from a streaming file read or a large payload, build the hash incrementally:

sha = Digest::SHA256.new
sha.update("hello")
sha.update(" ")
sha.update("world")
sha.hexdigest
# => "b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9"

Each call to update feeds more data into the hashing state without resetting it. That means the final digest is the same whether you feed all the data at once or spread it across many calls, as long as the order is preserved. The << operator is an alias for update, so you can chain more elegantly:

sha = Digest::SHA256.new
sha << "hello" << " " << "world"
sha.hexdigest

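Chaining with << reads well when you are assembling a digest from several known pieces, such as the fields that make up a cache key. The chain form is purely syntactic sugar, but it often makes the code read more like a pipeline. If one of the pieces is a secret key, prefer OpenSSL::HMAC over a plain digest of key plus message, because hashes such as SHA256 and SHA512 are open to length-extension attacks when used that way.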
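The claim that chunked and one-shot hashing agree can be checked directly. The sketch below reads from an IO-like object in fixed-size chunks; the helper name chunked_sha256 and the default chunk size are assumptions, and a StringIO stands in for a real stream.

```ruby
require 'digest'
require 'stringio'

# Hypothetical helper: hash any IO-like object in fixed-size chunks.
def chunked_sha256(io, chunk_size = 16 * 1024)
  sha = Digest::SHA256.new
  # read(n) returns nil at end of input, which ends the loop.
  while (chunk = io.read(chunk_size))
    sha << chunk
  end
  sha.hexdigest
end

data = "hello world\n" * 10_000
streamed = chunked_sha256(StringIO.new(data), 4096)

puts streamed == Digest::SHA256.hexdigest(data)   # true
```

The chunk size affects only how much memory is held at once, not the resulting digest.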

Once you are done, call reset to clear the state for a fresh round:

sha.reset
sha << "new data"
sha.hexdigest

Calling reset is important because the digest object is stateful. If you forget to reset before starting a new hash, the second digest will include data from the first one, which produces a completely different output. That bug can be hard to spot because both results look like valid hex strings.
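A small sketch makes the effect visible by comparing a reused object with and without reset against one-shot digests of the equivalent input. The variable names are illustrative.

```ruby
require 'digest'

sha = Digest::SHA256.new

sha << "hello"
first = sha.hexdigest      # digest of "hello"

sha << "world"             # no reset: the state now covers "helloworld"
leaked = sha.hexdigest

sha.reset
sha << "world"
clean = sha.hexdigest      # digest of "world" only

puts leaked == Digest::SHA256.hexdigest("helloworld")  # true
puts clean  == Digest::SHA256.hexdigest("world")       # true
```

Digest objects also provide hexdigest!, which returns the hex string and resets the state in one call, if you prefer not to call reset separately.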

Hashing files

To hash a file without loading it entirely into memory, use file:

sha = Digest::SHA256.file("/path/to/large.zip")
sha.hexdigest

Or more concisely as a one-liner:

Digest::SHA256.file("/path/to/large.zip").hexdigest

file reads in chunks, so it works on arbitrarily large files without loading the whole file into memory at once. That makes it a good fit for checksums, download verification, and other tasks where the file itself is the input.

The same pattern is helpful anywhere you want to keep memory use predictable. You do not need to build your own loop over the file unless you want finer control over the read buffer or need to combine the hash with other processing.
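For download verification, a minimal sketch might look like the following. The helper name sha256_matches? is hypothetical, and a temporary file containing "hello" stands in for the downloaded file so the example is self-contained.

```ruby
require 'digest'
require 'tempfile'

# Hypothetical helper: true when the file's SHA256 matches a published checksum.
# Published checksums vary in case and often carry trailing whitespace.
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.strip.downcase
end

# Stand-in for a downloaded file.
download = Tempfile.new("digest-demo")
download.write("hello")
download.close

published = "2CF24DBA5FB0A30E26E83B2AC5B9E29E1B161E5C1FA7425E73043362938B9824\n"
puts sha256_matches?(download.path, published)   # true
puts sha256_matches?(download.path, "0" * 64)    # false

download.unlink
```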

Checking equality with `==`

Compare a digest object directly against a hex string; == checks the string against the object's hexdigest:

sha = Digest::SHA256.file("/etc/hosts")
sha == "a7b3c2d1e9f0..."  # false if mismatched
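As a quick illustration of how == behaves, the sketch below compares a digest object with a hex string, with another digest object, and with an upper-case copy of the same hex string. The variable names are illustrative.

```ruby
require 'digest'

hello_hex = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"

a = Digest::SHA256.new << "hello"
b = Digest::SHA256.new << "hello"

puts a == hello_hex          # true: compared with a.hexdigest
puts a == b                  # true: both objects hold the same digest
puts a == hello_hex.upcase   # false: the string comparison is case-sensitive
```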

You may also see the same pattern used to verify passwords in a login flow. It is shown here only so you can recognize it, not as something to copy:

# Anti-pattern: a fast, unsalted digest is not suitable for passwords
def verify_password(input, stored_hash)
  Digest::SHA256.hexdigest(input) == stored_hash
end

For actual passwords, use a dedicated password hashing function such as bcrypt or Argon2 instead of a plain hash. Those tools are designed to be slow, which makes brute-force attacks much more expensive.

Digest hashing and password storage

Digest hashing is a poor fit for password storage because it is fast by design. That speed is useful for checksums and fingerprints, but it also means an attacker can test a huge number of candidate passwords in a short time. A plain digest is not enough protection on its own.

Password hashing libraries solve that by adding salts and by making the work intentionally expensive. When a user signs in, the application checks the submitted password against the stored verifier, not against a plain digest. That extra cost is what makes modern password storage safer.
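bcrypt and Argon2 are provided by gems, so the sketch below swaps in PBKDF2 from Ruby's bundled openssl library purely to illustrate the two ideas in that paragraph: a per-password salt and a deliberately expensive derivation. The helper names, the 16-byte salt, and the iteration count are assumptions for this example, not a recommendation from the article.

```ruby
require 'openssl'
require 'securerandom'

# Hypothetical helper: returns [salt, derived_key] for a password.
# The iteration count is an assumed work factor; tune it for your environment.
def derive_password_key(password, salt: SecureRandom.random_bytes(16), iterations: 600_000)
  key = OpenSSL::KDF.pbkdf2_hmac(password, salt: salt, iterations: iterations,
                                 length: 32, hash: "sha256")
  [salt, key]
end

# Hypothetical helper: re-derives the key with the stored salt and
# compares the two values in constant time.
def password_matches?(password, salt, stored_key, iterations: 600_000)
  _, candidate = derive_password_key(password, salt: salt, iterations: iterations)
  OpenSSL.fixed_length_secure_compare(candidate, stored_key)
end

salt, stored = derive_password_key("correct horse")
puts password_matches?("correct horse", salt, stored)  # true
puts password_matches?("wrong guess", salt, stored)    # false
```

Because the salt is random, two users with the same password end up with different stored values, which a plain digest cannot offer.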

If you are building anything that accepts logins, treat Digest as a teaching tool or a checksum helper, not as the final answer for credential storage. That one decision keeps the rest of your design much cleaner.

Choosing an algorithm

SHA256 is the default recommendation for most uses. It offers a good balance of speed and collision resistance for everyday tasks like checksums and content-addressable storage.

SHA512 is preferred when you want a longer hash: it produces 64 bytes (128 hex characters) against SHA256’s 32 bytes. SHA384 is a truncated variant of SHA512 with different initial values, producing 48 bytes, which gives you a middle ground without changing families.
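One detail worth checking for yourself: although SHA384 and SHA512 share the same internal design, a SHA384 digest is not simply the first 48 bytes of the SHA512 digest of the same input, because the two start from different initial values. A short sketch:

```ruby
require 'digest'

sha384 = Digest::SHA384.hexdigest("hello")
sha512 = Digest::SHA512.hexdigest("hello")

puts sha384.length                # 96
puts sha512.length                # 128
puts sha512.start_with?(sha384)   # false
```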

MD5 and SHA1 remain useful only for legacy compatibility, for example when you need to validate the checksum of an old download that was published with an MD5 hash. Never use them for new security work.

Frequently asked questions

Should I always use SHA256?

SHA256 is the default choice for most Ruby projects. It is widely supported, fast enough for normal checksum work, and much safer than MD5 or SHA1.

When should I choose SHA512?

Use SHA512 when you want a longer digest or when your policy prefers the SHA512 family. It is still a general-purpose hash, just with a larger output size.

Is `Digest` enough for passwords?

No. Passwords need a slow, salted hashing scheme such as bcrypt or Argon2.

Common pitfalls

Reusing a digest object without resetting. After calling hexdigest, the digest state still holds the accumulated data. Call reset before starting a new hash:

# Wrong: "world" is appended to the earlier "hello"
sha = Digest::SHA256.new
sha << "hello"
sha.hexdigest     # digest of "hello"
sha << "world"
sha.hexdigest     # digest of "helloworld", not "world"

# Right: clear the state first
sha.reset
sha << "world"
sha.hexdigest     # digest of "world"

Storing passwords as plain hashes. An unsalted SHA256 hash of a weak or common password can be cracked quickly with a dictionary attack, because each guess is so cheap to compute. Use bcrypt or Argon2 with built-in salting instead.

Is `digest` the same as `hexdigest`?

No. digest returns raw binary data, while hexdigest returns a hex string that is easier to read, print, and compare in logs.

Should I hash passwords with Digest?

No. Use bcrypt, Argon2, or another password hashing algorithm that is designed for slow verification and salting.

Can I hash a file without reading it all at once?

Yes. Digest::SHA256.file(path) reads in chunks, which keeps memory use low even for large files.