FWIW, I did a test a long time ago where I compared identical .223 loads using brass sorted by headstamp and weight (±0.25 gr) vs. brass unsorted by weight and headstamp. The surprising thing was that the unsorted lot gave me the smallest groups. The test gun was an AR-15, BTW. Maybe in bolt-action guns it would be different. Being a bit anal, though, I still sort by headstamp and weight, but the test results made me wonder if it's really necessary for hunting-type rifles.
Why would you accuracy test and not velocity test?
The brass does not contribute to group size directly; it contributes to velocity. Sure, variations in velocity can open up a group, but so can wind and the shooter's skill.
The purest measure of the difference between cases will appear on the chronograph, assuming you have an accurate scale. Not some POS beam scale or even a one-decimal-place digital scale. Otherwise you are just measuring the collective inaccuracy of your entire reloading and shooting ineptitude.
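
If you want to put numbers on a chronograph comparison like that, here is a rough sketch of one way to do it: take the string of velocities from each lot and compare mean, extreme spread (ES), and standard deviation (SD). The function name `summarize` and both velocity lists are made up for illustration. They are placeholders, not results from any test in this thread, so swap in your own readings.

```python
from statistics import mean, stdev


def summarize(velocities_fps):
    """Return mean, extreme spread, and sample SD for one string of shots."""
    return {
        "mean": mean(velocities_fps),
        "es": max(velocities_fps) - min(velocities_fps),  # fastest minus slowest
        "sd": stdev(velocities_fps),  # sample standard deviation (n - 1)
    }


# Placeholder velocities in fps, purely hypothetical. Replace with your own data.
sorted_lot = [3000, 3010, 2990, 3005, 2995]
unsorted_lot = [3000, 3030, 2970, 3015, 2985]

for name, shots in (("sorted", sorted_lot), ("unsorted", unsorted_lot)):
    s = summarize(shots)
    print(f"{name}: mean {s['mean']:.0f} fps, ES {s['es']} fps, SD {s['sd']:.1f} fps")
```

With strings this short, ES and SD bounce around a lot from one string to the next, so a small difference between lots probably doesn't mean much until you have more shots.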