Hacker News | narvind's comments

I like how you make the effort to go back and update your previous comments made in haste.


So Trifacta on top of Pandas. OK.


Yep. Some similarity to TFDV too, but the UI here looks to be more or less lifted directly from Trifacta/Cloud Dataprep.

pro: - Trifacta can be slow, and part of that might be the way it stores the data (I'm assuming JS data structures); if so, pandas/Bamboolib could improve that.

con: - Trifacta/Cloud Dataprep is directly integrated with Cloud Dataflow and can handle jobs that would crash Pandas.


Thank you for pointing out TFDV (TensorFlow Data Validation). I had not seen it before.

And yes, as I say in the video, we used the free version of Trifacta Wrangler to illustrate the vision of what we aspire to build. In the end, it will look different, of course, and we have some ideas about where a completely different user interface might make sense. Whether this will be better or worse remains to be seen.

And thank you for the comparison of Trifacta and pandas. I agree that pandas won't be able to handle arbitrarily large datasets. However, I wonder whether the feasible dataset size can be increased by working in the cloud on machines with more RAM, or maybe even by exporting Dask code instead of pandas code.
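To make the "export Dask code instead of pandas code" idea concrete, here is a minimal sketch of what such an export step could look like. Everything in it is an assumption for illustration: `GroupMeanStep`, `export_code`, the file name, and the column names are hypothetical and not part of Bamboolib or Trifacta. The only real library calls are the ones in the generated source (`read_csv`, `groupby(...).mean()`, and Dask's `.compute()`).

```python
# Hypothetical sketch: one recorded wrangling step rendered as pandas or Dask source.
from dataclasses import dataclass


@dataclass
class GroupMeanStep:
    """One recorded UI action: average `value_col` per `group_col` (illustrative names)."""
    path: str
    group_col: str
    value_col: str


def export_code(step: GroupMeanStep, backend: str = "pandas") -> str:
    """Return Python source for the step, targeting either pandas or Dask."""
    if backend == "pandas":
        header, reader, suffix = "import pandas as pd", "pd", ""
    elif backend == "dask":
        # Dask DataFrames are lazy, so the generated code must call compute().
        header, reader, suffix = "import dask.dataframe as dd", "dd", ".compute()"
    else:
        raise ValueError(f"unknown backend: {backend!r}")
    return (
        f"{header}\n"
        f"df = {reader}.read_csv({step.path!r})\n"
        f"result = df.groupby({step.group_col!r})[{step.value_col!r}].mean(){suffix}\n"
    )


step = GroupMeanStep(path="flights.csv", group_col="carrier", value_col="delay")
pandas_src = export_code(step, "pandas")
dask_src = export_code(step, "dask")
print(pandas_src)
print(dask_src)
```

The point of the sketch is only that the same recorded step can target two backends. For a simple group-by the two outputs differ in just the import and the final `.compute()`, though more complex operations would likely not map this cleanly.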

So, you seem to have experience working with Trifacta Wrangler. Is there something that you don't love about their solution?


It's slow, first and foremost. While I'm not 100% sure about the internals, I think that's because it performs these operations on JavaScript data structures in the browser, so pandas could be up to a few orders of magnitude faster out of the box.


Good to know, thank you for this!


Both form and content are superb. Very moving story. The crisis is real. We have to act now.

What are the best ideas proposed so far that might help them?


This should have been a video, not a slow-to-load, click-through format of lightweight social media snapshots.

This is a very serious humanitarian and economic problem and the glossy reportage format is too glib and simplistic IMO.

More importantly, what will help is an end to the Western siege on their economy because they are a socialist country: https://www.aljazeera.com/indepth/opinion/socialism-blame-ve...


Just to take away the creepiness factor, I'd say things like: Passengers who shared their favorite drinks with us rate us higher because we know them better and serve them accordingly.

This is akin to Amazon's main website changing "We recommend the following to you" to "People who bought this also bought", which induces a subtle hint of jealousy that others are getting a better deal while keeping it totally optional.

Any recommendation system should be a voluntary opt-in feature and should consistently do a great job at recommending.


Mr. Nadella's version of "developers developers" is top notch. I was scared we would get something like: https://giphy.com/gifs/michael-strahan-enTimXqzmVXR6


OMG... This is amazing! They call the low-dimensional compressed representations "state variables", which is very close to the ideas described here: https://arxiv.org/abs/1709.08568

Brilliant stuff!


Well said, Jeremy!


Fan mail induced high.


What are the odds that the fiancée story is fake and he's diverting the world's attention again, now that he has served time?

Maybe he did receive the bitcoin...

That'd be the bonus point after touchdown.


fast.ai international fellowship recipient here. The second iteration of the fast.ai course is totally worth it. If you can code in Python, you can master deep learning. Join our fantastic community!

