from Hacker News

Does Sturgeon's Law Apply to Datasets?

by nthypes on 4/16/24, 10:56 PM with 0 comments

In various fields, we often encounter the principle of Sturgeon's Law, which suggests that "90% of everything is crud."

When it comes to datasets, how much of this holds true?

With the proliferation of LLMs, are we seeing an overwhelming number of low-quality, irrelevant datasets?

Curious about HN's thoughts on that.