Very nice library. It would be useful to be able to generate random values that follow a given distribution, e.g. values that are random but follow a normal distribution with a given mean and standard deviation. This would help a lot when testing statistical libraries and applications.
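To make the request concrete, here is a minimal sketch of the kind of behaviour I mean. It uses only Python's standard `random` module, not this library. The helper name `normal_values` and its parameters are made up for illustration and are not part of any existing API.

```python
import random
import statistics


def normal_values(mean, stddev, count, seed=None):
    """Return `count` floats drawn from a normal distribution (hypothetical helper)."""
    # A private generator keeps seeding local, so a seeded run is reproducible.
    rng = random.Random(seed)
    return [rng.gauss(mean, stddev) for _ in range(count)]


# Example: 10,000 simulated values with mean 170 and standard deviation 10.
samples = normal_values(mean=170.0, stddev=10.0, count=10_000, seed=42)

# The sample statistics should land close to the requested parameters.
print(round(statistics.mean(samples), 1), round(statistics.stdev(samples), 1))
```

Passing a seed would let a test suite regenerate exactly the same data set on every run.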
Which attributes would you sample from a normal distribution here? I don't see any numerical attributes where this would make sense. We could add weight, height, age, etc. and sample them from the relevant distributions by region and gender?
An example would be the processing of automatically acquired measurement data. Such data usually follows some kind of distribution (binomial, normal, geometric, etc.) that needs to be taken into account when validating such a tool.
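For the discrete cases, a rough sketch of how such values could be produced is below. Again this is plain Python with the standard library only. `binomial_value` and `geometric_value` are hypothetical names, and the geometric variant assumes the convention where the value counts attempts up to and including the first success.

```python
import math
import random


def binomial_value(rng, trials, p):
    """Count successes in `trials` independent attempts, each succeeding with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    return sum(1 for _ in range(trials) if rng.random() < p)


def geometric_value(rng, p):
    """Number of attempts up to and including the first success (1, 2, 3, ...)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if p == 1.0:
        return 1
    # Inverse-transform sampling; u lies in (0, 1], which avoids log(0).
    u = 1.0 - rng.random()
    return max(1, math.ceil(math.log(u) / math.log(1.0 - p)))


rng = random.Random(7)  # seeded so the fake measurements are reproducible
defects_per_batch = [binomial_value(rng, trials=20, p=0.3) for _ in range(5)]
tries_until_success = [geometric_value(rng, p=0.25) for _ in range(5)]
print(defects_per_batch, tries_until_success)
```

A validation test could then compare the statistics the tool reports against the known parameters used to generate the data.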