I’m restarting the mailbag!! Drop your questions in the comments below and I will post answers next week.
Also, please find me over on Bluesky: https://bsky.app/profile/annieduke.bsky.social
Collecting data is time-consuming and expensive. Value of Information (VoI; en.wikipedia.org/wiki/Value_of_information) can give you an idea of whether collecting is worth it. If even perfect information would not change your decision, then there is no point in collecting imperfect information (lesswrong.com/posts/vADtvr9iDeYsCDfxd/value-of-information-four-examples). The value of perfect information (VoPI) is often easier to work out, and it acts as a ceiling: you would not want to spend more than the VoPI on collection. Is this something you use in your work? Do you have other ideas on what and how much data to collect?
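To make the VoPI idea concrete, here is a minimal sketch in Python, assuming a decision with a small set of actions and discrete states of the world. The function name, the "launch"/"hold" actions, the payoffs, and the probabilities are all made up for illustration and do not come from either linked article.

```python
def value_of_perfect_information(payoffs, probs):
    """Expected value of perfect information for a discrete decision.

    payoffs: dict mapping each action to a list of payoffs, one per state.
    probs:   list of state probabilities (same order as the payoff lists).
    """
    # Best expected payoff if we must choose now, knowing only the probabilities.
    ev_without = max(
        sum(p * v for p, v in zip(probs, row)) for row in payoffs.values()
    )
    # Expected payoff if we learn the true state first, then pick the best action.
    ev_with = sum(
        p * max(row[s] for row in payoffs.values())
        for s, p in enumerate(probs)
    )
    return ev_with - ev_without


# Hypothetical example: launch pays 100 in a good market, loses 60 in a bad one.
risky = {"launch": [100, -60], "hold": [0, 0]}
# Hypothetical example: launch wins in both states, so information changes nothing.
safe = {"launch": [100, 10], "hold": [0, 0]}

vopi_risky = value_of_perfect_information(risky, [0.5, 0.5])
vopi_safe = value_of_perfect_information(safe, [0.5, 0.5])
print(vopi_risky, vopi_safe)
```

In the first case, knowing the state would let you hold in the bad market, so the VoPI is positive and caps what any study is worth. In the second case, launching is best in every state, the VoPI is zero, and no data collection can pay for itself.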
Base rates are still based on surveys, so data hygiene means using base rates and also vetting the base rates themselves. They should be measured on the target population they are named for and come with confidence intervals. A humorous, simple book on data hygiene is "How to Lie with Statistics" (goodreads.com/book/show/51291.How_to_Lie_with_Statistics) by Darrell Huff (effortmark.co.uk/avoid-how-to-lie-with-statistics/).