HANNAH: The belief underneath this request is just wrong, so let me start there. The instrument retailer wants to dump every URL the site can produce into the sitemap, thinking that listing a page there gets it indexed. A sitemap doesn’t index anything. It’s a suggestion about which URLs you consider worth discovering. Search engines still decide independently whether each page deserves indexing. Stuffing everything in doesn’t force indexing; it just claims you think all of these pages matter, and that claim is only as credible as the pages behind it.
ELENA: And the credibility is the whole thing, which is exactly why “everything” backfires. A sitemap is most useful as a clean signal of your canonical, valuable pages. Fill it with filtered views, internal search results, thin tag pages, parameter variants, discontinued products, and the signal degrades.
HANNAH: You’ve handed over a phone book instead of a map.
ELENA: Right, and now the crawler has to sort your valuable pages from your junk itself, which is the opposite of what the sitemap was for.
MARCUS: Before we pile on, though, there’s a real fear driving this, and “you’re wrong” doesn’t address it. The owner’s scared good pages aren’t getting found, and the everything-sitemap feels like insurance. For a big instrument catalog, that fear can be legitimate.
HANNAH: It can. The cure just isn’t burying the good pages among the bad.
MARCUS: Agreed, that’s my point. The answer to “are my good pages found” is a clean sitemap plus actual internal linking, not a comprehensive dump. So the instinct’s sound; it’s the response that’s blunt. Don’t dismiss the worry, redirect it.
SOFIA: And the junk URLs shouldn’t be indexed for a reason that has nothing to do with the sitemap, which reinforces leaving them out. Picture the shopper who arrives from Google on an internal search-results page or a filtered view: thin, often empty, duplicative. That’s a bad arrival, and search engines notice bad arrivals. So pushing those toward indexing doesn’t just dilute a signal, it risks surfacing pages that disappoint the exact people the retailer wants. The pages that don’t belong in the sitemap are the ones that shouldn’t greet a searcher either.
NOAH: It’s the completeness reflex, the same one we keep hitting. More links, more keywords, more schema, now more URLs. The error’s identical every time: treating an inclusion count as the goal instead of the quality of what’s included. The tell is the word “every,” applied to a thing whose whole value is selectivity. A sitemap’s worth is in what it leaves out as much as what it includes.
THEO: So the rule flips from coverage to curation. The sitemap holds only canonical, indexable, valuable URLs you genuinely want found: real product pages, meaningful category pages, actual content, each in canonical form, once. Exclude by design the filtered and sorted variants, internal search, thin tags, parameter URLs, non-canonical duplicates, and dead pages. If a URL wouldn’t make a good search landing page, it’s not in the sitemap.
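Theo’s rule can be read as a single yes-or-no test applied to each URL. What follows is a minimal sketch of that test, not the retailer’s actual implementation: the `Page` record, its field names, the page-type labels, and the example.com URLs are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical page record; the field names are illustrative assumptions,
# not taken from any real catalog system.
@dataclass(frozen=True)
class Page:
    url: str
    canonical_url: str
    indexable: bool   # e.g. returns 200 and carries no noindex directive
    page_type: str    # "product", "category", "editorial", "search", "tag", ...
    live: bool        # False for discontinued or dead pages

# Assumed set of page types that make good search landing pages.
ALLOWED_TYPES = {"product", "category", "editorial"}

def belongs_in_sitemap(page: Page) -> bool:
    """Apply the inclusion rule: live, indexable, valuable type, canonical form."""
    return (
        page.live
        and page.indexable
        and page.page_type in ALLOWED_TYPES
        and page.url == page.canonical_url  # drops non-canonical duplicates
        and "?" not in page.url             # drops filter, sort, and parameter variants
    )

def curate(pages):
    """Return each qualifying URL exactly once, preserving input order."""
    seen = set()
    urls = []
    for page in pages:
        if belongs_in_sitemap(page) and page.url not in seen:
            seen.add(page.url)
            urls.append(page.url)
    return urls
```

Writing the rule as a predicate makes the exclusions explicit and reviewable, so a URL stays out of the sitemap because of a stated criterion and not because someone forgot to list it.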
AIKO: And generate it from those rules, Theo. Don’t hand-list and don’t dump, because this is a catalog that churns, with instruments going in and out of stock weekly. The sitemap should be produced automatically from pages meeting clear criteria: canonical, indexable, in-stock or evergreen. It should update as the catalog moves, with an honest lastmod so the crawler knows what actually changed instead of recrawling everything. On a large catalog, split it into segments, with products separate from categories separate from editorial, because then Search Console’s sitemap report shows submitted-versus-indexed per segment, and you can see exactly which type of page is getting dropped instead of staring at one total. An automatically curated, segmented sitemap stays a trustworthy map of the good pages even as thousands of product URLs churn beneath it.
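The segmented, automatically generated sitemap Aiko describes could be sketched roughly as below, using only Python’s standard library. The segment names, the `sitemap-<segment>.xml` file naming, and the record shape are assumptions for illustration; the element names (`urlset`, `url`, `loc`, `lastmod`, `sitemapindex`, `sitemap`) and the namespace come from the sitemaps.org protocol. The sketch assumes the URLs passed in have already been curated, and it ignores the protocol’s per-file limit of 50,000 URLs.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from datetime import date

# Namespace defined by the sitemaps.org protocol.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_urlset(entries):
    """Build one sitemap file from (url, last_modified) pairs."""
    root = ET.Element("urlset", xmlns=NS)
    for url, modified in entries:
        node = ET.SubElement(root, "url")
        ET.SubElement(node, "loc").text = url
        # lastmod should reflect a real content change, not the generation time.
        ET.SubElement(node, "lastmod").text = modified.isoformat()
    return ET.tostring(root, encoding="unicode")

def build_segmented(records, base_url):
    """Group (segment, url, last_modified) records into one sitemap per segment.

    Returns a dict mapping file name to XML text, including a sitemap index
    that points at each segment file. File names are a hypothetical convention.
    """
    segments = defaultdict(list)
    for segment, url, modified in records:
        segments[segment].append((url, modified))

    files = {}
    index = ET.Element("sitemapindex", xmlns=NS)
    for segment in sorted(segments):
        entries = segments[segment]
        name = f"sitemap-{segment}.xml"
        files[name] = build_urlset(entries)
        node = ET.SubElement(index, "sitemap")
        ET.SubElement(node, "loc").text = f"{base_url}/{name}"
        # A segment's lastmod is the newest change among its URLs.
        newest = max(modified for _, modified in entries)
        ET.SubElement(node, "lastmod").text = newest.isoformat()
    files["sitemap-index.xml"] = ET.tostring(index, encoding="unicode")
    return files
```

Submitting each segment file separately is what lets a per-sitemap report show which page type is being dropped, which is the reason for splitting in the first place.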
HANNAH: So it comes back to the misread: a sitemap recommends discovery, it doesn’t manufacture indexing, and its power lies entirely in being a selective, trustworthy signal. The fear Marcus raised is valid; the cure is a clean sitemap plus real internal linking, not an exhaustive sitemap.
DANA: So where we land is curation, not coverage. We don’t put every URL in the sitemap, because a sitemap doesn’t force indexing. It signals which pages we consider worth discovering, and a signal full of filtered views, search results, thin tags, and dead pages is one we’ve made worthless. We include only canonical, indexable, genuinely valuable URLs, each in canonical form, and we deliberately exclude the variants and thin pages that shouldn’t be landing pages anyway. We generate it from inclusion rules so it stays accurate as the catalog churns, and we pair it with solid internal linking, which is Marcus’s actual answer to the discovery fear driving this. The worry that good pages get missed is fair. The misread is believing that cramming everything in fixes it, when all it does is bury the good among the junk.
SOFIA: That turns the sitemap back into a map of your best pages, instead of a phone book the crawler has to sift.
DANA: A sitemap earns its keep by what it leaves out. List only the pages you’d be glad to see someone land on, and it becomes a signal worth trusting.