By default, block or noindex auto-generated pages from the start, and allow indexing only in the rare case where each page delivers genuinely unique value and meets real search demand. The default is to block because the typical auto-generated page is templated, thin, or combinatorial, which is exactly the kind of low-value page that creates index bloat. The reason to decide this up front is simple: preventing bloat is far cheaper than cleaning it up after the search engine has indexed thousands of weak URLs and lowered its assessment of your site as a result.
The deciding factor is value per page, judged honestly. Some generated pages do earn their place: for example, a programmatic page that assembles genuinely useful, distinct data for a query people actually search, where each instance answers a real and different need. Those can be indexed, because each one stands on its own. But these are the exception, and the test is strict: not “could someone find this mildly relevant,” but “does this specific generated page add unique value and meet real demand.” If every page is the same template with a variable swapped in, it fails the test no matter how many you produce.
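One way to make that test concrete is a rough screen that rejects pages with no measurable demand or with near-duplicate siblings. The following is a minimal sketch under stated assumptions, not a definitive method: the `GeneratedPage` type, the `monthly_searches` field, and both thresholds are hypothetical, and word-shingle overlap is only a crude proxy for "same template with a variable swapped in." A page that passes this screen still needs honest human judgment about its value.

```python
from dataclasses import dataclass


@dataclass
class GeneratedPage:
    url: str
    body: str
    monthly_searches: int  # hypothetical demand estimate from your own keyword data


def shingles(text, size=3):
    """Return the set of overlapping word n-grams in the text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Share of shingles two pages have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def passes_value_test(page, siblings, max_similarity=0.8, min_searches=10):
    """Rough screen: require some demand and no near-duplicate sibling.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    if page.monthly_searches < min_searches:
        return False
    page_shingles = shingles(page.body)
    for other in siblings:
        if other.url == page.url:
            continue  # do not compare a page with itself
        if jaccard(page_shingles, shingles(other.body)) > max_similarity:
            return False  # mostly the same text as another generated page
    return True
```

Comparing every page against every sibling is quadratic, so this form only suits a sample of the generated set; the point is to check the pattern before generating at scale, not to audit every URL.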
The “index everything and prune later” approach gets the economics backwards. Once weak pages are in the index, they have already diluted your quality signal and consumed crawl budget on URLs that should never have been live. Cleaning up means finding them, noindexing them, waiting for re-crawl, and waiting again for the site-wide re-assessment to register the improvement. Blocking them before they ever index skips all of that. Prevention is one decision made early; cleanup is a long campaign run late.
So set the rule before you generate the pages, not after. Block or noindex the templated, thin, and combinatorial output by default, and carve out indexing only for the specific generated pages that each clear the unique-value-and-demand bar. Decide it at the start, because the cheapest bloat is the bloat you never let into the index.
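The default-block rule can be sketched as a small function in the page generator. This is an illustrative sketch only: the function names and the `approved_urls` set are hypothetical, and the choice of `noindex, follow` over other directive combinations is an assumption. The robots meta tag itself is the standard mechanism.

```python
INDEX = "index, follow"
NOINDEX = "noindex, follow"


def robots_directive(url, approved_urls):
    """Default to noindex; index only URLs that were explicitly approved."""
    return INDEX if url in approved_urls else NOINDEX


def robots_meta_tag(url, approved_urls):
    """Render the robots meta tag to place in the generated page's <head>."""
    return f'<meta name="robots" content="{robots_directive(url, approved_urls)}">'


# Hypothetical usage: only the pages that cleared the value-and-demand bar are listed.
approved = {"/data/austin-water-quality"}
tag = robots_meta_tag("/tags/blue-widgets-cheap", approved)
```

The design choice is that indexing is opt-in: a new template or a new combination of parameters gets noindex automatically, and someone has to add a URL to the approved set on purpose. This sketch covers only the noindex route; blocking crawl through robots.txt is a separate mechanism.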