Hash#group_by
Overview
group_by is defined on Enumerable, so it works on hashes as well as arrays. When called on a hash, it iterates over each key-value pair, applies the block to derive a group key, and returns a hash where each group key maps to an array of all entries that produced that key.
The critical thing to understand: calling group_by on a hash returns a hash whose values are arrays of [key, value] pairs, not a hash of hashes. If you want the latter, you need to transform the result, for example with transform_values(&:to_h).
Signature
group_by { |(key, value)| block } -> Hash
group_by -> Enumerator
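The block-less form is mostly useful for chaining. As a minimal sketch (the settings hash and the slot labels are made up for illustration), the returned Enumerator can be combined with with_index to group entries by position rather than by content:

```ruby
# Hypothetical data for illustration.
settings = { a: 1, b: 2, c: 3 }

# Without a block, group_by returns an Enumerator.
enum = settings.group_by

# with_index yields each [key, value] pair together with its position;
# the block's return value becomes the group key.
by_position = enum.with_index do |(_key, _value), index|
  index.even? ? "even_slot" : "odd_slot"
end
# { "even_slot" => [[:a, 1], [:c, 3]], "odd_slot" => [[:b, 2]] }
```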
Return Value
Returns a hash where:
- Keys are the return values of the block
- Values are arrays of [key, value] pairs that produced each key
Basic Usage
Grouping by a simple condition
numbers = [1, 2, 3, 4, 5, 6]
numbers.group_by(&:odd?) # { true => [1, 3, 5], false => [2, 4, 6] }
Grouping a hash by value condition
users = {
alice: 28,
bob: 17,
carol: 35,
dave: 22
}
users.group_by { |_name, age| age >= 18 ? "adult" : "minor" }
# { "adult" => [[:alice, 28], [:carol, 35], [:dave, 22]], "minor" => [[:bob, 17]] }
Grouping by string length
words = ["apple", "apricot", "banana", "blueberry", "cherry"]
words.group_by(&:length)
# { 5 => ["apple"], 7 => ["apricot"], 6 => ["banana", "cherry"], 9 => ["blueberry"] }
Common Use Cases
Counting by category
orders = [
{ product: "Widget", status: "shipped" },
{ product: "Gadget", status: "pending" },
{ product: "Gizmo", status: "shipped" },
{ product: "Widget", status: "processing" }
]
orders.group_by { |o| o[:status] }.transform_values(&:length)
# { "shipped" => 2, "pending" => 1, "processing" => 1 }
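When only the counts are needed, the intermediate groups can be skipped. A possible alternative, assuming Ruby 2.7 or later (where Enumerable#tally is available) and the same orders data as above:

```ruby
orders = [
  { product: "Widget", status: "shipped" },
  { product: "Gadget", status: "pending" },
  { product: "Gizmo", status: "shipped" },
  { product: "Widget", status: "processing" }
]

# Extract the status of each order, then count occurrences directly.
# Assumes Ruby 2.7+ for Enumerable#tally.
counts = orders.map { |o| o[:status] }.tally
# { "shipped" => 2, "pending" => 1, "processing" => 1 }
```

The result matches the group_by version, but no arrays of grouped orders are built along the way.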
Partitioning data with labels
temperatures = {
monday: 22,
tuesday: 19,
wednesday: 25,
thursday: 18,
friday: 27,
saturday: 21,
sunday: 23
}
temperatures.group_by { |_day, temp| temp > 22 ? "warm" : "mild" }
# { "mild" => [[:monday, 22], [:tuesday, 19], [:thursday, 18], [:saturday, 21]],
# "warm" => [[:wednesday, 25], [:friday, 27], [:sunday, 23]] }
Grouping by date component
events = {
event_a: Time.new(2024, 3, 15),
event_b: Time.new(2024, 3, 15),
event_c: Time.new(2024, 3, 22),
event_d: Time.new(2024, 3, 22),
event_e: Time.new(2024, 4, 1)
}
events.group_by { |_name, time| time.strftime("%Y-%m") }
# { "2024-03" => [[:event_a, ...], [:event_b, ...], [:event_c, ...], [:event_d, ...]],
# "2024-04" => [[:event_e, ...]] }
Grouping with transformation
products = {
sku_001: { name: "Widget", category: "A", price: 10 },
sku_002: { name: "Gadget", category: "B", price: 25 },
sku_003: { name: "Gizmo", category: "A", price: 15 },
sku_004: { name: "Doohickey", category: "B", price: 30 }
}
# Group by category, but keep only product names in each group
products.group_by { |_sku, attrs| attrs[:category] }
.transform_values { |entries| entries.map { |_k, v| v[:name] } }
# { "A" => ["Widget", "Gizmo"], "B" => ["Gadget", "Doohickey"] }
Result Structure
The return value is always a Hash, with arrays of two-element arrays as values:
h = { a: 1, b: 2, c: 3 }
result = h.group_by { |_k, v| v.odd? ? "odd" : "even" }
# { "odd" => [[:a, 1], [:c, 3]], "even" => [[:b, 2]] }
# Accessing the grouped data
result["odd"].to_h # => { a: 1, c: 3 } (convert back to hash)
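One detail worth keeping in mind when accessing groups: a group key that no entry produced is simply absent from the result. A short sketch, reusing the hash above and a made-up "zero" label, shows one way to handle that with fetch and a default:

```ruby
h = { a: 1, b: 2, c: 3 }
result = h.group_by { |_k, v| v.odd? ? "odd" : "even" }

# No entry produced "zero", so a plain lookup returns nil.
missing = result["zero"]

# fetch with a default avoids nil checks before calling to_h.
zero_hash = result.fetch("zero", []).to_h  # empty hash
odd_hash  = result.fetch("odd", []).to_h   # { a: 1, c: 3 }
```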
Chaining with Other Enumerable Methods
# Group, then filter groups by size
data = [1, 2, 3, 4, 5]
data.group_by { |x| x.odd? ? "odd" : "even" }
.select { |_label, items| items.size > 2 }
# { "odd" => [1, 3, 5] }
# Group, then sum values within each group
scores = { alice: 95, bob: 82, carol: 78, dave: 91 }
scores.group_by { |_name, score| (score / 10).to_s + "0s" }
.transform_values { |entries| entries.map { |_k, v| v }.sum }
# { "90s" => 186, "80s" => 82, "70s" => 78 }
Gotchas
Returns arrays of pairs, not hashes. When you iterate a hash with group_by, each group contains [key, value] pairs. You often need .to_h or .transform_values to get the structure you want:
# What you get (array of pairs)
h = { a: 1, b: 2 }
h.group_by { |_k, v| v.odd? }
# { true => [[:a, 1]], false => [[:b, 2]] }
# What you might have wanted (hash of sub-hashes)
h.group_by { |_k, v| v.odd? }.transform_values(&:to_h)
# { true => { a: 1 }, false => { b: 2 } }
Block argument order. When iterating a hash, the block receives the key first, then the value. This is the same order as with each:
h.each { |k, v| puts "#{k} => #{v}" } # a => 1, b => 2
Performance. group_by makes a single pass over the collection, but it keeps every entry in the resulting arrays. For very large datasets, consider whether you need the full grouped structure or whether a running aggregate built in one pass with each_with_object is enough.
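As a rough sketch of the single-pass approach, the per-decade totals from the chaining example can be accumulated directly with each_with_object. The totals name and the Hash.new(0) accumulator are choices made here for illustration:

```ruby
scores = { alice: 95, bob: 82, carol: 78, dave: 91 }

# Accumulate a running sum per decade label; no arrays of pairs are built.
# Hash.new(0) makes every missing label start at zero.
totals = scores.each_with_object(Hash.new(0)) do |(_name, score), acc|
  acc["#{score / 10}0s"] += score
end
# { "90s" => 186, "80s" => 82, "70s" => 78 }
```

This trades the reusable grouped structure for lower memory use, so it fits best when only the aggregate is needed.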
See Also
- /reference/enumerable/enumerable-group-by/ — the Enumerable version, works on arrays and hashes
- /reference/hash-methods/select/ — filter a hash by key or value conditions
- /guides/ruby-hash-tricks/ — practical patterns for transforming and grouping hashes