Hacker News

Could this data be anonymized and open-sourced for training diagnostic algorithms? It’s hard to put the genie back in the bottle so why not at least make some use of the images?


Possibly, though with only the images you'd be missing some useful info, like the actual outcome. Also, they are likely not "high quality" images on average, so, for example, if cancer is present, it may not be identifiable in the image.

See https://www.cancerimagingarchive.net/ for some examples of carefully curated data.


Is it possible? The metadata is easy to anonymize. Uniquely identifying features shown in the images (scars, etc.)? Not without destroying them.

How much is the data worth for machine learning if you do not have access to the interpretation (and annotations) for the data? That is the hard part.

But is it ethical or even legal to do so without patient consent? No (at least not in my country).


In theory, yes. I was working on doing this (for internal data) at a large healthcare system some time ago.

The de-id part was actually really easy since DICOM is a very standardized format and this hospital system had good practices in place to only input certain information about each patient.
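To make the metadata side concrete, here is a minimal sketch of what header de-identification could look like. It is not the pipeline described above. It assumes the DICOM header has already been read into a plain dict keyed by attribute keyword (in practice a library such as pydicom would do the reading), and the keyword list, the salt handling, and the function names are all illustrative assumptions, not a complete de-identification profile.

```python
import hashlib

# Assumption: a small, non-exhaustive set of identifying DICOM attribute keywords.
# A real de-identification profile covers many more attributes.
IDENTIFYING_KEYWORDS = {
    "PatientName",
    "PatientBirthDate",
    "PatientAddress",
    "ReferringPhysicianName",
    "InstitutionName",
    "AccessionNumber",
}


def pseudonymize_id(patient_id: str, salt: str) -> str:
    """Map a patient ID to a stable pseudonym using a salted SHA-256 hash."""
    digest = hashlib.sha256((salt + patient_id).encode("utf-8")).hexdigest()
    return digest[:16]


def deidentify_header(header: dict, salt: str) -> dict:
    """Return a copy of the header with identifying fields blanked.

    PatientID is replaced by a pseudonym so that studies from the same
    patient can still be linked; all other fields pass through unchanged.
    """
    cleaned = {}
    for keyword, value in header.items():
        if keyword in IDENTIFYING_KEYWORDS:
            cleaned[keyword] = ""  # blank the value, keep the field
        elif keyword == "PatientID":
            cleaned[keyword] = pseudonymize_id(str(value), salt)
        else:
            cleaned[keyword] = value
    return cleaned
```

This only touches header fields. It does nothing about text burned into the pixel data or identifying features visible in the image itself, and the salt would need to be kept secret for the pseudonyms to mean anything.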


Does it need to be anonymized since it is now public? Maybe just don't publish identifying information in your results.



