Binary searching with Array#bsearch
For tiny arrays, find is often perfectly fine. But once the data is sorted and starts to grow, Array#bsearch becomes a wonderfully efficient tool. Imagine a transit app that has a full day of departures, and wants the next train after 14:37:
departures = Array.new(24 * 12) do |i|
hour = i / 12
minute = (i % 12) * 5
"%02d:%02d" % [hour, minute]
end
# => ["00:00", "00:05", "00:10", ..., "23:55"]
desired_time = "14:37"
linear_checks = 0
next_departure = departures.find do |time|
linear_checks += 1
time >= desired_time
end
# => "14:40"
bsearch_checks = 0
next_departure = departures.bsearch do |time|
bsearch_checks += 1
time >= desired_time
end
# => "14:40"
linear_checks
# => 177
bsearch_checks
# => 8
const bsearch = (array, predicate) => {
let left = 0;
let right = array.length - 1;
let match = null;
while (left <= right) {
const middle = Math.floor((left + right) / 2);
if (predicate(array[middle])) {
match = array[middle];
right = middle - 1;
} else {
left = middle + 1;
}
}
return match;
};
const departures = Array.from({ length: 24 * 12 }, (_, i) => {
const hour = String(Math.floor(i / 12)).padStart(2, "0");
const minute = String((i % 12) * 5).padStart(2, "0");
return `${hour}:${minute}`;
});
const desiredTime = "14:37";
let linearChecks = 0;
const nextDepartureByFind = departures.find((time) => {
linearChecks += 1;
return time >= desiredTime;
});
// => "14:40"
let bsearchChecks = 0;
const nextDeparture = bsearch(departures, (time) => {
bsearchChecks += 1;
return time >= desiredTime;
});
// => "14:40"
console.log(linearChecks);
// => 177
console.log(bsearchChecks);
// => 8
In this example, because the times are zero-padded, lexicographical order matches chronological order, so bsearch works beautifully.
The broader point is that bsearch packs a lot of algorithmic leverage into a very small API. Instead of writing low/high pointer loops ourselves, we just describe the point where the answer flips from "too early" to "good enough".
One important caveat: bsearch assumes the array is already sorted, and the block needs to behave monotonically. In other words, once the block starts returning true, it should stay true for the rest of the array. If that is not the shape of the problem, find is likely the better tool.
There is also a second mode where the block returns a negative/zero/positive number, making bsearch behave more like a comparator-based search. I personally reach for the boolean form most of the time since I find it easier to read.
History
Array#bsearch was added in Ruby 2.0, released in 2013.
Its sibling Array#bsearch_index followed in Ruby 2.3, released on Christmas 2015.