
I see 'GitHub for data', but what I'm reading about is services to hide PII. I think a distinction needs to be made here. Is the primary goal to enable Change Data Capture on data (GitHub does CDC on code), or is the primary goal to manage PII?


Hey, thanks for the great question. The "GitHub for data" part refers to the ability to collaborate on data. By streaming in data, you can view the discoveries we made on it (entity recognition, etc.), then essentially make a new version of that data with automatic transforms, anonymizations, and so on. So you're absolutely right, managing PII is part of it, but really it's about enabling entity A to share data with entity B with a high level of confidence that the sensitive data has been stripped out.
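To make that flow a bit more concrete, here is a rough sketch of the "discover, then produce a transformed version" idea in Python. It is only an illustration under stated assumptions, not our actual implementation: the regex patterns stand in for real entity recognition, and the names `PATTERNS`, `discover`, `anonymize`, and `transform_stream` are all hypothetical.

```python
import re

# Hypothetical stand-ins for entity recognition. Real detection would be
# much more thorough than two regular expressions.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def discover(record):
    """Return a mapping of field name to the entity labels found in it."""
    found = {}
    for field, value in record.items():
        labels = [label for label, pat in PATTERNS.items() if pat.search(str(value))]
        if labels:
            found[field] = labels
    return found


def anonymize(record):
    """Return a new record with detected entities replaced by placeholders.

    Values are converted to strings; the input record is left unchanged.
    """
    out = {}
    for field, value in record.items():
        text = str(value)
        for label, pat in PATTERNS.items():
            text = pat.sub(f"<{label}>", text)
        out[field] = text
    return out


def transform_stream(records):
    """Lazily yield an anonymized version of each incoming record."""
    for record in records:
        yield anonymize(record)
```

In this sketch, entity A would inspect the output of `discover` to see which fields look sensitive, then hand entity B only the records produced by `transform_stream`.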

We'll be releasing some of the packages to do the analysis and transformations as time goes on, so stay tuned for those so you can take them for a test drive yourself.

Thanks!



