Lazy Enumerators for Large Datasets

Updated March 18, 2026

When working with large datasets in Ruby, loading everything into memory can become a bottleneck. Ruby provides Enumerator::Lazy to solve this problem — it lets you process data piece by piece, stopping as soon as you have what you need.

What Are Lazy Enumerators?

A lazy enumerator doesn’t process elements until something asks for a result. Eager Enumerable methods such as map and select run over the whole collection and return a complete array at every step. A lazy enumerator instead builds a pipeline of operations and pushes each element through it only when needed.

# Eager evaluation - processes everything immediately
(1..Float::INFINITY).map { |i| i * 2 }.first(5)
# Never returns: map tries to build an infinite array before first(5) can run

# Lazy evaluation - processes only what you ask for
(1..Float::INFINITY).lazy.map { |i| i * 2 }.first(5)
# => [2, 4, 6, 8, 10]
# Works fine - only computes the 5 values needed

The key difference: .lazy returns an Enumerator::Lazy instead of an Enumerator or Array.
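To make the pipeline behavior concrete, the sketch below logs the order in which the blocks run for the same two-step chain, once eagerly and once lazily. The names eager_log and lazy_log are illustrative only.

```ruby
# Record the order in which the blocks run for an eager and a lazy chain.
eager_log = []
eager_result = (1..3)
  .map    { |i| eager_log << "map #{i}"; i * 2 }
  .select { |i| eager_log << "select #{i}"; i > 2 }

lazy_log = []
lazy_result = (1..3).lazy
  .map    { |i| lazy_log << "map #{i}"; i * 2 }
  .select { |i| lazy_log << "select #{i}"; i > 2 }
  .to_a

p eager_log
# ["map 1", "map 2", "map 3", "select 2", "select 4", "select 6"]
p lazy_log
# ["map 1", "select 2", "map 2", "select 4", "map 3", "select 6"]
```

Both chains return [4, 6]. The eager chain finishes map for every element before select starts, while the lazy chain sends one element at a time through both steps.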

Creating Lazy Enumerators

The simplest way is calling .lazy on any enumerable:

# From a range
(1..1000).lazy

# From an array
[1, 2, 3].lazy

# From a file (common use case); File.foreach closes the file when iteration ends
File.foreach('large_file.txt').lazy

You can also use Enumerator::Lazy.new for custom lazy enumerators:

def filter_map(sequence)
  Enumerator::Lazy.new(sequence) do |yielder, *values|
    result = yield(*values)
    yielder << result if result
  end
end

filter_map(1..10) { |i| i * 2 if i.even? }.first(3)
# => [4, 8, 12]
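Because .lazy works on any enumerable, a hand-written generator can be made lazy as well. The following sketch is an illustration rather than part of the original examples: it wraps an infinite Fibonacci Enumerator, and the names fib and even_fibs are hypothetical.

```ruby
# An infinite generator built with Enumerator.new; it never ends by itself.
fib = Enumerator.new do |yielder|
  a, b = 0, 1
  loop do
    yielder << a
    a, b = b, a + b
  end
end

# .lazy lets select run over the infinite sequence without hanging.
even_fibs = fib.lazy.select(&:even?).first(5)
p even_fibs  # => [0, 2, 8, 34, 144]
```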

Lazy Methods

All these Enumerable methods work lazily on a lazy enumerator:

Method                        Description
.map                          Transforms each element
.select / .filter             Filters elements by condition
.filter_map                   Filters and transforms in one step
.flat_map / .collect_concat   Flattens nested results
.take(n)                      Takes the first n elements
.drop(n)                      Skips the first n elements
.take_while                   Takes elements until the condition fails
.drop_while                   Drops elements until the condition fails
.grep(pattern)                Keeps elements matching a pattern
.chunk                        Groups consecutive elements
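A short sketch of several of these methods chained together may help. The sample data and variable names are made up for illustration.

```ruby
words = %w[apple banana cherry avocado blueberry apricot]

# grep, map and drop chained lazily; nothing runs until to_a.
a_words = words.lazy.grep(/\Aa/).map(&:upcase).drop(1).to_a
p a_words  # => ["AVOCADO", "APRICOT"]

# drop_while and take_while carve a finite window out of an endless range.
window = (1..).lazy.drop_while { |i| i < 5 }.take_while { |i| i < 9 }.to_a
p window   # => [5, 6, 7, 8]

# flat_map flattens each block result into the output stream.
pairs = (1..3).lazy.flat_map { |i| [i, i * 10] }.to_a
p pairs    # => [1, 10, 2, 20, 3, 30]

# Each intermediate step is still an Enumerator::Lazy.
p words.lazy.grep(/\Aa/).class  # => Enumerator::Lazy
```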

Forcing Evaluation

Lazy enumerators don’t execute until you force them. Use these methods to trigger evaluation:

lazy_enum = (1..).lazy.map { |i| i * 2 }

# force - returns an array of all elements
lazy_enum.force
# Warning: Will hang on infinite sequences!

# first(n) - returns first n elements (most common)
(1..).lazy.map { |i| i * 2 }.first(10)
# => [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

# to_a - converts to array (same as force)
lazy_enum.take(5).to_a

# each - evaluates and iterates
lazy_enum.take(5).each { |i| puts i }
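One detail worth spelling out is that take(n) and first(n) behave differently on a lazy enumerator: take(n) only extends the pipeline, while first(n) evaluates it. A minimal sketch, with illustrative variable names:

```ruby
doubled = (1..).lazy.map { |i| i * 2 }

taken  = doubled.take(3)   # still lazy: nothing has been computed yet
firsts = doubled.first(3)  # evaluated immediately, returns an Array

p taken.class  # => Enumerator::Lazy
p firsts       # => [2, 4, 6]
p taken.to_a   # => [2, 4, 6]
p taken.force  # => [2, 4, 6]
```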

Converting Back to Eager

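Sometimes you need a regular enumerator again, for example to pass to code that expects eager Enumerable behavior. Use .eager (available since Ruby 2.7):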

lazy_enum = (1..100).lazy.map { |i| i * 2 }
eager_enum = lazy_enum.eager
# Returns Enumerator, not Enumerator::Lazy
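The practical effect shows up in what the next method call returns. This sketch assumes Ruby 2.7 or later, where Enumerator::Lazy#eager is available.

```ruby
lazy_enum  = (1..100).lazy.map { |i| i * 2 }
eager_enum = lazy_enum.eager

p lazy_enum.is_a?(Enumerator::Lazy)   # => true
p eager_enum.is_a?(Enumerator::Lazy)  # => false

# On the eager enumerator, select returns a fully built Array again.
big = eager_enum.select { |i| i > 190 }
p big  # => [192, 194, 196, 198, 200]

# On the lazy one, the same call only extends the pipeline.
p lazy_enum.select { |i| i > 190 }.class  # => Enumerator::Lazy
```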

Practical Example: Processing a Large File

Lazy enumerators shine when processing files too big to fit in memory:

# Read file line by line, find first 10 matches
File.foreach('server.log')
  .lazy
  .grep(/ERROR/)
  .take(10)
  .each { |line| puts line }

# Process CSV without loading entire file
require 'csv'

CSV.foreach('massive.csv')  # without a block, returns an Enumerator over rows
  .lazy
  .select { |row| row[3].to_i > 1000 }
  .map { |row| [row[0], row[3]] }
  .take(100)
  .to_a
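To check that a lazy file pipeline really stops reading early, the sketch below builds a throwaway log with Tempfile and counts how many lines are pulled. The file contents, the counter lines_read and the use of File.foreach are assumptions made for the demonstration.

```ruby
require 'tempfile'

# Build a throwaway 1,000-line log; every tenth line is an ERROR.
log = Tempfile.new('server_log')
1.upto(1000) do |n|
  level = (n % 10).zero? ? 'ERROR' : 'INFO'
  log.puts "#{n} #{level} message"
end
log.close  # flush to disk; the file stays until unlink

lines_read = 0
errors = File.foreach(log.path).lazy
  .map  { |line| lines_read += 1; line }  # count lines actually pulled
  .grep(/ERROR/)
  .first(2)

p errors      # => ["10 ERROR message\n", "20 ERROR message\n"]
p lines_read  # => 20

log.unlink
```

Only 20 of the 1,000 lines pass through the pipeline, because first(2) stops iteration as soon as the second match arrives.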

Common Gotchas

1. Lazy Doesn’t Mean “No Computation”

Lazy still executes blocks for each element — it just delays when:

# This still calls the block 10 times, not 5
(1..).lazy.select { |i| i.even? }.take(5).each { |i| puts i }
# Output: 2, 4, 6, 8, 10
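A counter makes the number of block calls visible. The variable calls is introduced here purely for illustration.

```ruby
calls = 0
evens = (1..).lazy.select { |i| calls += 1; i.even? }.first(5)

p evens  # => [2, 4, 6, 8, 10]
p calls  # => 10
```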

2. Performance Overhead

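Enumerator::Lazy adds per-element overhead, so on small datasets it is often slower than eager evaluation. A slowdown of roughly 2-4x is typical, but the exact factor depends on the Ruby version and the workload, so measure your own case. Only use lazy when dealing with: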

  • Infinite sequences
  • Large files
  • Remote API streams
  • Cases where you need only the first few results

# Don't use lazy for small arrays - just use regular map
[1, 2, 3].map { |i| i * 2 }  # Fast

# Use lazy for large/infinite data
(1..1_000_000).lazy.map { |i| i * 2 }.first(5)  # Efficient
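The overhead is easy to measure on your own machine. The following is a rough sketch, not a rigorous benchmark: elapsed_seconds is a hypothetical helper, the collection size and run count are arbitrary, and the timings will vary by Ruby version and hardware.

```ruby
# Hypothetical helper: wall-clock seconds taken by the given block.
def elapsed_seconds
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

small = (1..100).to_a
runs  = 10_000

# Both forms must produce the same result before timing them.
eager_out = small.map { |i| i * 2 }.select { |i| i % 3 == 0 }
lazy_out  = small.lazy.map { |i| i * 2 }.select { |i| i % 3 == 0 }.to_a

eager_time = elapsed_seconds do
  runs.times { small.map { |i| i * 2 }.select { |i| i % 3 == 0 } }
end

lazy_time = elapsed_seconds do
  runs.times { small.lazy.map { |i| i * 2 }.select { |i| i % 3 == 0 }.to_a }
end

puts format('eager: %.3fs  lazy: %.3fs', eager_time, lazy_time)
```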

3. Forgetting to Force

A lazy enumerator that never gets forced only records the pipeline; none of its blocks run:

result = (1..).lazy.map { |i| puts "computing #{i}"; i * 2 }
# Nothing printed yet - no evaluation happened

result.first(3)
# Now it evaluates: computing 1, computing 2, computing 3

4. Infinite Loops

Without limiting results, you can hang your program:

# Bad - will run forever
(1..).lazy.map { |i| i * 2 }.each { |i| puts i }

# Good - limits results
(1..).lazy.map { |i| i * 2 }.first(10).each { |i| puts i }
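A fixed count is not the only way to bound an endless pipeline. As a sketch, take_while can stop on a condition instead, whereas select cannot bound anything:

```ruby
# take_while stops pulling as soon as its block returns false.
squares = (1..).lazy.map { |i| i * i }.take_while { |sq| sq < 50 }.to_a
p squares  # => [1, 4, 9, 16, 25, 36, 49]

# select keeps scanning for later matches, so the following line
# would never return if uncommented.
# (1..).lazy.select { |i| i < 5 }.to_a
```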

Performance Tip: filter_map

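Ruby 2.7+ provides filter_map, which filters and transforms in a single block. In a lazy pipeline, select followed by map already visits each element only once, so the saving is one fewer pipeline stage and block call per element, not a second pass over the data:

# Two chained stages (an extra block call per surviving element)
(1..10).lazy.select { |i| i.even? }.map { |i| i * 2 }.force

# One stage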
(1..10).lazy.filter_map { |i| i * 2 if i.even? }.force
# => [4, 8, 12, 16, 20]
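The two forms can be compared directly. The sketch below also shows one caveat to keep in mind: filter_map discards every nil or false block result, so it is not a drop-in replacement when false is a legitimate mapped value.

```ruby
two_stage = (1..10).lazy.select { |i| i.even? }.map { |i| i * 2 }.force
one_stage = (1..10).lazy.filter_map { |i| i * 2 if i.even? }.force
p two_stage == one_stage  # => true

# Caveat: the false results for 1 and 3 are dropped, not kept.
flags = [1, 2, 3].lazy.filter_map { |i| i.even? }.to_a
p flags  # => [true]
```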

When to Use Lazy Enumerators

Use lazy when:

  • Processing files too large for memory
  • Working with infinite sequences
  • Only needing the first N results from a large dataset
  • Streaming data from external sources

Avoid lazy when:

  • Working with small, fixed-size data
  • You need all results anyway
  • Performance is critical for small datasets
