Hacker News | captaintobs's comments

There are many teams using SQLMesh in production: Fivetran, Harness, Hopper, and Pitchbook, to name a few.

You can read some case studies here: https://tobikodata.com/harness.html or join Slack to meet folks and learn more about their experiences.


How does Fivetran use SQLMesh?


They're using it for data transformation. They're longtime dbt users, but are switching to SQLMesh because it's extremely efficient, provides a better development experience, and can help them become warehouse agnostic.


Thanks! Yes, it's a much-requested feature, but it's difficult to get right!


Why is this faster than the stdlib? What does it do to achieve better performance?


It's in the README of the GitHub project.

> In short, the main reasons why MPIRE is faster are:

    When fork is available we can make use of copy-on-write shared objects, which reduces the need to copy objects that need to be shared over child processes

    Workers can hold state over multiple tasks. Therefore you can choose to load a big file or send resources over only once per worker

    Automatic task chunking
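
For comparison, the second and third points have rough standard-library counterparts. The sketch below is not MPIRE's API: it uses `multiprocessing.Pool` with an `initializer` to set up per-worker state once, and `chunksize` to batch tasks. The names `init_worker`, `task`, `run` and the file name are hypothetical, and it assumes a POSIX system where fork is available.

```python
import multiprocessing as mp
import os

# Per-process storage; each worker fills its own copy once in init_worker.
_state = {}


def init_worker(source):
    # Runs once per worker process. Stands in for "load a big file once".
    _state["resource"] = {"source": source, "pid": os.getpid()}


def task(x):
    # Reuses the resource prepared by init_worker instead of reloading it.
    return x + 1, _state["resource"]["pid"]


def run(n=20, workers=2, chunksize=5):
    ctx = mp.get_context("fork")  # assumption: fork is available (not Windows)
    with ctx.Pool(workers, initializer=init_worker,
                  initargs=("hypothetical.bin",)) as pool:
        # chunksize sends tasks to workers in batches rather than one by one.
        return pool.map(task, range(n), chunksize=chunksize)


if __name__ == "__main__":
    results = run()
    print(sorted({pid for _, pid in results}))  # at most `workers` distinct PIDs
```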


COW can come back and bite you by making runtime hard to predict.

Your code goes down a rarely used branch and suddenly a large object gets copied.


Isn’t this given “for free” by the fact that it’s fork, even in standard multiprocessing? What does the library do extra?


It doesn't do much extra I guess.

In standard multiprocessing, all arguments are pickled and pushed to a queue for processes in the pool to use.

To pass heavy arguments, the trick for using CoW was to set them as global variables before calling map.

My understanding of MPIRE is that it does the same thing, but exposes a `shared_objects` parameter to make it less hacky than global variables.

I guess their benchmarks compare against pickling arguments, not against using global variables/CoW, which is why they boast a performance increase.
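
A minimal sketch of the global-variable trick with the standard library, assuming a POSIX system where fork is available. `BIG_TABLE`, `lookup` and `run` are hypothetical names. The heavy object is assigned before the pool is created, so forked workers inherit it and only the small index is pickled per task.

```python
import multiprocessing as mp

# Hypothetical heavy object. It must be set before the pool is created so
# that forked workers inherit it instead of receiving it through a queue.
BIG_TABLE = None


def lookup(i):
    # Reads the inherited global; only the integer `i` is pickled per task.
    return BIG_TABLE[i] * 2


def run(n=1000, workers=2):
    global BIG_TABLE
    BIG_TABLE = list(range(n))
    ctx = mp.get_context("fork")  # assumption: fork is available (not Windows)
    with ctx.Pool(workers) as pool:
        return pool.map(lookup, range(n))


if __name__ == "__main__":
    print(run()[:5])
```

One caveat: in CPython, even reading an object updates its reference count, which writes to the memory page holding it, so some pages of a "shared" object can still end up copied.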


Yeah, I'm struggling to figure out what the secret sauce of this library is, and whether that sauce introduces footguns down the line.

The standard multiprocessing module already uses fork on Linux. I once ran the same multiprocessing code on Linux and Windows, and performance was significantly better on Linux.


They're deprecating fork in 1 or 2 versions. One of the main issues with it is that it copies locks across processes, which can cause deadlocks.
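
Code that cares about this can request a start method explicitly instead of relying on the platform default. A minimal sketch, where `pick_start_method` is a hypothetical helper and the preference order is only an assumption:

```python
import multiprocessing as mp


def pick_start_method(preferred=("forkserver", "spawn", "fork")):
    """Return the first preferred start method this platform supports."""
    available = mp.get_all_start_methods()
    for name in preferred:
        if name in available:
            return name
    raise RuntimeError("no supported start method found")


if __name__ == "__main__":
    method = pick_start_method()
    # Use ctx.Pool / ctx.Process from here on instead of the module defaults.
    ctx = mp.get_context(method)
    print(method)
```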


Very cool work!


It's mostly affecting the amount of AI content I have to wade through on these news aggregators...


There's a lot of hype around ruff, but I've been doing fine with black and autoflake. I have a pretty sizeable project and have never found them slow enough to be a problem.


We used both SVB and First Republic!


This is a garbage/sensationalist/clickbait article. They source blind conversations.


Like almost everything on BI it seems.

It's just an awful, spammy clickbait-y site, all the way down. Almost by definition in violation of HN guidelines.


Monorepos at scale only really work if you have the tooling and infra to support them. Otherwise, you're going to be miserable with slow builds, pushes, and pulls, and impossible merge conflicts.


It's fun having the ritual of grinding coffee. Maybe I can't tell the difference but I enjoy it!

