Takeaways
So you've seen the clusters. Now what?
The shift in lens
Most podcast analytics ask what people listened to. Cohort asks which apps the downloads came through, then groups shows whose downloads are distributed similarly across the 142-column app dimension.
Against Podcast Index topic categories the clusters scored ARI 0.016. Random. Against popularity bands and app mix, ARI 0.215. Real signal. Two shows can be very different topically and still get downloaded through the same app mix. Two shows in the same genre can have very different app distributions.
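For readers who haven't met the adjusted Rand index (ARI): it compares two labelings of the same items, landing near 0 when they agree no better than chance and at 1 when the partitions are identical. Below is a minimal pure-Python sketch of the standard Hubert-Arabie formula. The function name and the toy labels are illustrative assumptions, not Cohort's code or data.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """ARI between two labelings of the same items (hypothetical helper)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same items")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: nothing to disagree about

    # Pairs of items that share a label in both labelings, in A, and in B.
    both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    in_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    in_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = in_a * in_b / total_pairs   # agreement expected by chance
    maximum = (in_a + in_b) / 2            # agreement if partitions matched
    if maximum == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (both - expected) / (maximum - expected)

# Toy labels, made up for illustration only.
clusters = [0, 0, 1, 1]
same_grouping_renamed = [1, 1, 0, 0]
crossed_grouping = [0, 1, 0, 1]
print(adjusted_rand_index(clusters, same_grouping_renamed))
print(adjusted_rand_index(clusters, crossed_grouping))
```

The score ignores label names and looks only at which items end up grouped together, so cluster IDs can be compared directly against topic categories or popularity bands.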
Three app clusters carry most of the listening
89% of observed downloads flow through three of the sixteen app clusters. They all look like "mainstream platforms" from the outside. The algorithm separated them because the show categories distributed through them differ enough to register as distinct groups.
For podcasters
If you run a podcast, the cluster picture is useful for a few things.
Which cluster your show lives in
Look at your show's download distribution across apps. The dominant apps place you in one of the clusters above. That cluster predicts how your downloads distribute more reliably than your show's topic does.
Concretely, a tech podcast and a true-crime podcast at the same download scale will probably share more cohort neighbors than a tech podcast and a different tech podcast at very different scales. Scale and app mix dominate.
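To look at your own show's app distribution, a small sketch like the following may help. It converts per-app download counts into shares and lists the smallest set of apps that covers most of the downloads. The function names, the 80% coverage threshold, and the sample counts are all assumptions for illustration; Cohort's actual cluster assignment comes from its clustering step, not from this heuristic.

```python
def app_shares(downloads_by_app):
    """Turn raw per-app download counts into fractions of the total."""
    total = sum(downloads_by_app.values())
    if total <= 0:
        raise ValueError("no downloads to distribute")
    return {app: count / total for app, count in downloads_by_app.items()}

def dominant_apps(downloads_by_app, coverage=0.8):
    """Smallest set of top apps whose combined share reaches `coverage`.

    The 0.8 default is an arbitrary illustrative threshold.
    """
    shares = app_shares(downloads_by_app)
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    picked, running = [], 0.0
    for app, share in ranked:
        picked.append(app)
        running += share
        if running >= coverage:
            break
    return picked

# Made-up counts for a hypothetical show.
my_show = {
    "Apple Podcasts": 600,
    "Spotify": 250,
    "Overcast": 100,
    "Pocket Casts": 50,
}
print(app_shares(my_show))
print(dominant_apps(my_show))
```

Comparing the dominant apps against the app clusters described above gives a rough idea of where a show sits.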
Reading the lift signal
Within each app cluster, certain show categories over-perform or under-perform relative to the global average. The lift heatmap shows the cluster-by-category picture. Some practical signals:
- Tech & entrepreneurship: dedicated podcast apps over-index ~4×. Marketing budget into Overcast, Pocket Casts, and podcast-app discovery tends to convert above baseline.
- True crime: walled-garden Apple+Spotify is the main download source. Dedicated podcast apps are under-indexed. Platform recommendation slots matter more than dedicated-app placements.
- Health & society: smart-speaker apps over-index ~3×. Content that reads well as ambient listening (single-host, slower pace) has a structural fit.
- News: dedicated podcast apps over-index. The walled-garden audience treats news as one genre among many rather than a focused listening habit.
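The over- and under-indexing figures above are lift values. One common definition, assumed here because this page doesn't spell out the exact formula, is a category's share of downloads inside a cluster divided by that category's share of all downloads. A lift of 4 then means the category is four times as prominent in that cluster as it is overall. The sketch below uses made-up counts and hypothetical cluster and category names.

```python
from collections import defaultdict

def lift_table(downloads):
    """Map {(cluster, category): count} to {(cluster, category): lift}.

    Assumed definition: lift = share within cluster / global share.
    """
    cluster_totals = defaultdict(float)
    category_totals = defaultdict(float)
    grand_total = 0.0
    for (cluster, category), count in downloads.items():
        cluster_totals[cluster] += count
        category_totals[category] += count
        grand_total += count

    lifts = {}
    for (cluster, category), count in downloads.items():
        # Skip cells whose cluster or category has no downloads at all.
        if cluster_totals[cluster] == 0 or category_totals[category] == 0:
            continue
        share_in_cluster = count / cluster_totals[cluster]
        global_share = category_totals[category] / grand_total
        lifts[(cluster, category)] = share_in_cluster / global_share
    return lifts

# Toy counts, invented for illustration only.
toy_downloads = {
    ("walled_garden", "tech"): 100,
    ("walled_garden", "true_crime"): 700,
    ("dedicated_apps", "tech"): 150,
    ("dedicated_apps", "true_crime"): 50,
}
for cell, lift in sorted(lift_table(toy_downloads).items()):
    print(cell, round(lift, 2))
```

Values above 1 mark over-indexing and values below 1 mark under-indexing, which is how the heatmap cells read under this definition.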
Cross-promotion candidates
Cohort row clusters group shows whose downloads distribute through similar app mixes at similar scale. Your cluster mates reach audiences who use the same set of apps. Cross-promoting through those shows puts you in front of listeners who are already in the right app environment to find you.
The matrix can't tell us whether those listeners overlap directly with your existing audience, only that the platform reach overlaps. That's still useful. Many cross-promo programs are sold on platform-reach overlap, not literal listener overlap.
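If you only have per-app download counts for a handful of shows, a rough stand-in for finding cluster mates is to rank shows by how similar their app mixes are. The sketch below uses cosine similarity, which is a swapped-in technique and not Cohort's clustering algorithm. It also ignores scale, so you would still want to filter to shows of similar size. Show names, counts, and function names are hypothetical.

```python
from math import sqrt

def cosine_similarity(mix_a, mix_b):
    """Cosine similarity between two {app: downloads} dictionaries."""
    apps = set(mix_a) | set(mix_b)
    dot = sum(mix_a.get(app, 0) * mix_b.get(app, 0) for app in apps)
    norm_a = sqrt(sum(v * v for v in mix_a.values()))
    norm_b = sqrt(sum(v * v for v in mix_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a show with no downloads matches nothing
    return dot / (norm_a * norm_b)

def rank_cross_promo(target_mix, other_shows):
    """Rank other shows by app-mix similarity to the target, best first."""
    scored = [
        (name, cosine_similarity(target_mix, mix))
        for name, mix in other_shows.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Made-up app mixes for illustration.
my_show = {"Overcast": 50, "Pocket Casts": 30, "Apple Podcasts": 20}
candidates = {
    "show_a": {"Overcast": 500, "Pocket Casts": 300, "Apple Podcasts": 200},
    "show_b": {"Spotify": 900, "Apple Podcasts": 100},
}
for name, score in rank_cross_promo(my_show, candidates):
    print(name, round(score, 3))
```

As with the cluster picture itself, a high score here says the two shows reach listeners through the same apps, not that they share listeners.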
Beyond individual shows
Some things that show up in the data matter for platforms and tooling:
- Walled-garden dominance: 77% of downloads come through Apple Podcasts, Spotify, and CastBox. Platform UX and platform recommendation algorithms shape the majority of podcast listening, full stop.
- Engaged-listener cohort: the 12% on dedicated podcast apps over-indexes on tech, entrepreneurship, and news. Niche vertical podcasts can find their audience here in higher concentration than on platform-default apps.
- Smart-speaker listening: 6% of downloads come through smart-speaker and connected-device apps. Listening is ambient. Health and society content over-indexes here.
- Long tail: the remaining 5% includes language- and region-specific clusters (iVoox for Spanish, Xiao Yu Zhou for Chinese, Anghami for Arabic). These don't show up in aggregate platform metrics but they're meaningful inside their language groups.
What the data isn't
The aggregate-downloads-per-(show, app) format imposes real limits on what cohort can support.
Also see: analysis · algorithm reference · slide deck · the paper