Floyd's Sampling Algorithm • Buttondown
I love sampling algorithms. Here's the sampling algorithm that I find most magical. We want to generate a subset of {1, 2, ..., n} of size k. def floyd(n,...
April 6, 2026
Floyd's Sampling Algorithm #

I love sampling algorithms. Here's the sampling algorithm that I find most magical. We want to generate a subset of {1, 2, ..., n} of size k.
def floyd(n, k):
s = set()
for i in range(n - k + 1, n + 1):
t = random.randint(1, i)
if t in s:
s.add(i)
else:
s.add(t)
return s
I learned about this algorithm the canonical way all good algorithm lore is imparted; one of my old coworkers wrote this comment ↗ in some code I was looking at once:
// Use Robert Floyd's algorithm to generate numSplits distinct integers
// between 0 and len(splitKeys), just because it's so cool!
chosen := make(map[int]struct{})
for j := len(splitKeys) - numSplits; j < len(splitKeys); j++ {
t := fit.rng.Intn(j + 1)
if , alreadyChosen := chosen[t]; !alreadyChosen {
// Insert T.
chosen[t] = struct{}{}
} else {
// Insert J.
chosen[j] = struct{}{}
}
}
Unlike, say, Reservoir Sampling or something, this algorithm sort of resists an easy intuition to me. The branch makes it weird and makes a normal combinatorial, or visual argument tricky. In fact, I find even a casual intuition hard to grasp. Here are two possible ways to think about what makes it work.
First Perspective #
Here's one way to think about it: each iteration of the algorithm goes from a uniformly distributed k-subset of [1..n] and an element of [1..n+1] to a uniformly distributed k+1-subset of [1..n+1]. If we can show that the map from those k-subsets and that sample is k+1-to-1, then we've shown that the resulting set is also uniformly distributed.
So, we need to show that each possible k+1-subset S of [1..n+1] has exactly k+1 inputs that map to it:
- sets containing
n+1come from thek-subset withoutn+1, and any of the elements ofS, and - sets not containing
n+1come from starting with one of itsk-subsets and adding the missing element.
This is not so bad, I think it basically makes sense to me. But I also learned of another way of thinking about this algorithm while trying to write this post.
Second Perspective #
This one comes from Paul ↗, and it notes a connection between this algorithm and the Fisher-Yates shuffle for permuting an array uniformly at random. Specifically, the "upward" variation that looks like this:
def shuffle(a):
for i in range(0, len(a)):
j = random.randint(0, i)
a[i], a[j] = a[j], a[i]
This builds up a permutation incrementally. After each step, we have a permutation on [0..i], then we augment that and turn it into a permutation on [0..i+1].
Imagine plucking off the last k elements of an array we just shuffled, then that's a uniformly random sample of k elements. The observation here is that Floyd's algorithm is just doing the last k swaps here, directly.
If you call the last k elements of the array the "tail," then for each element i in the tail originally, either we:
- swap
iwithjthat's in the tail, in which caseiwill stay in the tail forever and be part of the sample, or - swap it with
jnot in the tail, thenjwill live in the tail forever and be part of the sample.
Don't miss what's next. Subscribe to NULL BITMAP by Justin Jaffray:
Email address (required) Subscribe
#### Add a comment:
Comment and Subscribe
Canonical source
https://docs.platphormnews.com/docs/floyd-and-number-x27s-sampling-algorithm-buttondown-1943
Source domain
buttondown.com
Publisher
buttondown.com
License / usage
Unknown. Review the original source terms before republishing beyond public-safe excerpts.
Overall quality score, confidence 81%
21 sentences, 3 headings, 4 list items.
Keep source attribution visible in the rendered document.
Related Documentation
Chert | iMessage Infrastructure for Reaching People at Scale
Skip to main content https://docs.platphormnews.com/docs/chert imessage infrastructure for reaching people at scale main content Back to docs https://docs.platphormnews.com/docs Skip to content ↗ https://www.trychert.com
7 min read
SEO Starter Guide: The Basics | Google Search Central | Documentation | Google for Developers
Skip to main content https://developers.google.com/search/docs/fundamentals/seo starter guide main content Google Search Central English Deutsch Español Español – América Latina Français Indonesia Italiano Polski Portugu
22 min read
Chert | iMessage Infrastructure for Reaching People at Scale
Skip to content https://www.trychert.com/ main content New Chert is now live on Hacker News Check it out → https://www.trychert.com/agent Chert https://www.trychert.com/ Home https://www.trychert.com/ Blog https://www.tr
5 min read
Three Inverse Laws of AI - Susam Pal
9 min read
GameStop Proposes to Acquire eBay at $125.00 Per Share | GameStop Corp.
GameStop Corp. (NYSE: GME) today submitted a non-binding proposal to acquire 100% of eBay Inc. (NASDAQ: EBAY) at $125.00 per share in cash and stock. The offer represents a 46% premium to eBay’s unaffected closing price on February 4, 2026, the day GameStop started accumulating its position in eBay. GameStop has built a 5% economic stake in eBay through derivatives and beneficial ownership of common stock. GameStop is filing a Schedule 13D and HSR notification tomorrow. The full proposal letter and accompanying materials are available at investor.gamestop.com/ebay . The proposed offer is $125.00 per share, comprising 50% cash and 50% GameStop common stock, with full shareholder election rights as to consideration type and pro-rata allocation. Aggregate undiluted equity value is approximately $55.5 billion, based on eBay’s most recently disclosed undiluted share count, representing a 27% premium to the 30-day VWAP and a 36% premium to the 90-day VWAP. The transaction is conditioned on
11 min read