
Mise does do local-only Make-similar task caching, if you specify sources and outputs: https://mise.jdx.dev/tasks/task-configuration.html#sources

If you specify sources but not "outputs" then mise will auto-track whether sources have been modified.

I requested the auto-track feature to speed up Docker builds a pretty long time ago, and it's been fantastic.
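
For reference, roughly what that looks like in mise.toml (task contents here are made up; schema per the linked docs):

    [tasks.docker-build]
    run = "docker build -t myapp ."
    # with both sources and outputs, mise skips the task when outputs
    # are up to date relative to sources
    sources = ["Dockerfile", "src/**/*"]
    outputs = ["dist/image-id.txt"]

    [tasks.lint]
    run = "cargo clippy"
    # with only sources, mise auto-tracks whether they've changed
    # since the last run
    sources = ["src/**/*.rs"]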


This is good to know! The tool comparison saying it doesn't support caching is a bit vague; I assumed that meant local caching too.

Ideally local and remote caching would be built on the same underlying code path.


Ah, I think I misinterpreted this to mean remote caching.


Yeah, artifact caching is the obvious interpretation of caching when you're used to being compared to Bazel, but the conversation was conflating "cache artifacts" and "cache should-run?" features.


> Not much polars can do about that in Rust

I'm ignorant about the exact situation in Polars, but it seems like this is the same problem that web frameworks have to handle to enable registering arbitrary functions, and they generally do it with a FromRequest trait plus macros that implement it for functions of up to N arguments. I'm curious whether there were attempts that failed for something like FromDataframe, to enable at least |c: Col<i32>("a"), c2: Col<f64>("b")| {...}

https://github.com/tokio-rs/axum/blob/86868de80e0b3716d9ef39...

https://github.com/tokio-rs/axum/blob/86868de80e0b3716d9ef39...
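
For anyone unfamiliar with the pattern, a stripped-down sketch of what I mean; all of the names here (DataFrame, FromDataframe, Handler) are hypothetical stand-ins, not real polars or axum items:

    // hypothetical stand-ins
    struct DataFrame;
    struct Error;

    // each argument type knows how to build itself from the shared
    // input, like axum's FromRequest
    trait FromDataframe: Sized {
        fn extract(df: &DataFrame) -> Result<Self, Error>;
    }

    // the framework-side trait, implemented for closures of each arity
    trait Handler<Args> {
        fn call(&self, df: &DataFrame) -> Result<(), Error>;
    }

    // one macro invocation per arity, mirroring axum's all_the_tuples! trick
    macro_rules! impl_handler {
        ($($ty:ident),+) => {
            impl<F, $($ty,)+> Handler<($($ty,)+)> for F
            where
                F: Fn($($ty),+),
                $($ty: FromDataframe,)+
            {
                fn call(&self, df: &DataFrame) -> Result<(), Error> {
                    self($($ty::extract(df)?),+);
                    Ok(())
                }
            }
        };
    }
    impl_handler!(A1);
    impl_handler!(A1, A2);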


You'd still have problems.

1. There are no variadic functions so you need to take a tuple: `|(Col<i32>("a"), Col<f64>("b"))|`

2. Turbofish! `|(Col::<i32>("a"), Col::<f64>("b"))|`. This is already getting quite verbose.

3. This needs to be general over all expressions (such as `col("a").str.to_lowercase()`, `col("b") * 2`, etc), so while you could pass a type such as Col if it were IntoExpr, its conversion into an expression would immediately drop the generic type information because Expr doesn't store that (at least not in a generic parameter; the type of the underlying series is always discovered at runtime). So you can't really skip those `.i32()?` calls.

Polars definitely made the right choice here — if Expr had a generic parameter, then you couldn't store Expr of different output types in arrays because they wouldn't all have the same type. You'd have to use tuples, which would lead to abysmal ergonomics compared to a Vec (can't append or remove without a macro; need a macro to implement functions for tuples up to length N for some gargantuan N). In addition to the ergonomics, Rust’s monomorphization would make compile times absolutely explode if every combination of input Exprs’ dtypes required compiling a separate version of each function, such as `with_columns()`, which currently is only compiled separately for different container types.
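
To make the array point concrete, a toy version (hypothetical generic Expr; the real polars Expr is not generic):

    use std::marker::PhantomData;

    struct Expr<T>(PhantomData<T>);

    fn main() {
        let a: Expr<i32> = Expr(PhantomData);
        let b: Expr<f64> = Expr(PhantomData);
        // a Vec can hold only one element type:
        // let exprs = vec![a, b]; // ERROR: expected Expr<i32>, found Expr<f64>
        let _ = (a, b); // a tuple works, but loses Vec's push/remove ergonomics
    }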

The reason web frameworks can do this is because of `$( $ty: FromRequestParts<S> + Send, )*`. All of the tuple elements share the generic parameter `S`, which would not be the case in Polars — or, if it were, would make `map` too limited to be useful.


Thanks for the insight!


And the following criteria:

(a) The final category can never be lower than the highest hazard-based category;

(b) The TCSS should adequately reflect the case of high potential risk of two or more hazards. We consider a hazard of high risk when its respective category is classified as 3 or higher (equal to the definition for a Major Hurricane on the SSHWS). Whenever (at least) two high risk hazards have the same category value and the third hazard has a lower category value, the final category should increment the highest hazard-based category. This implies that a TC scoring a Category 3 on both wind and storm surge, and a Category 1 on rainfall, will be classified as a Category 4.

(c) To warn the general public for an event with multiple extreme hazards, a high-risk TC can be classified as a Category 6 when either 1. at least two of the hazard-based categories are of Category 5; or 2. two categories are of Category 4, and one of Category 5.
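
In code, my reading of those rules (a sketch only; the paper may resolve edge cases like a three-way tie differently):

    // cats: the three hazard-based categories (wind, surge, rainfall)
    fn tcss_final_category(mut cats: [u8; 3]) -> u8 {
        cats.sort_unstable();
        let [low, mid, high] = cats;
        // (c) multi-hazard extremes can reach Category 6
        if (mid == 5 && high == 5) || (low == 4 && mid == 4 && high == 5) {
            return 6;
        }
        // (b) two high-risk (>= 3) hazards tie at the top and the third
        // is lower: increment the highest category
        if high >= 3 && mid == high && low < high {
            return high + 1;
        }
        // (a) never lower than the highest hazard-based category
        high
    }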


> a few tenths of a percentage point of NYC

Feb 2024 (the most recent year with data, I think) was a record low, at 1.4% vacant, according to NYC[1].

But I don't really know the methodology, and it's surprising in light of other NYC gov data, since we still haven't recovered our population from COVID[2].

The first statistic (housing pressure) is based on population growth, but the NYC population statistics suggest meaningful population loss persisting since 2020.

I have seen articles in the past that suggest that apartment vacancy rates in NYC are self-reported and misleading at best, but I don't really understand how that would work and I can't find any sources on that now.

It's also my understanding that some classes of landlords can write empty apartments off as income losses, partially or fully making up for the lost revenue in tax rebates. But that's also not something I understand well, just something I have seen asserted.

[1]: https://www.nyc.gov/site/hpd/news/007-24/new-york-city-s-vac... [2]: https://s-media.nyc.gov/agencies/dcp/assets/files/pdf/data-t...


1.4% vacancy in a housing market is extraordinarily low. Remember: there is structurally always some material amount of vacancy, because people vacate housing units well before new people move into them. This, by the way, is a stat whose interpretation you can just look up. Real estate people use it as a benchmark.


Yeah, I know it's among the lowest in the world; it's still roughly an order of magnitude higher than a few tenths of a percent, which would be shocking for the reasons you mention.

My point, though, was just that I've seen arguments that these numbers can be manipulated, and the city's own data doesn't make sense by itself: either the 1.4% number is wrong or the slowly recovering population estimate is wrong, especially considering the 60,000 housing units created (representing 2% growth).


I was replying to this claim

> They leave it empty (that actually happens a lot, especially with foreign investors

Not talking about rental vacancy.


Vacancy doesn’t mean units held empty as a parking place for cash or held off the market. Vacancy happens when you’re painting and repairing between rentals. Vacancy happens when there’s a renovation. Things like that are normal and not nefarious. Having a 1.4% vacancy rate means there is essentially no usable housing for rent.

I was talking about the myth that there are tons of apartments held by rich people who don’t use them for anything.


My understanding is that vacancy means units available for rent. So, plausibly, if you say 50 of the 100 units in your building aren't available for rent because they're being painted, then they don't contribute to the vacancy of your building.

That's almost the exact opposite of your definition, but I agree that a 1.4% vacancy rate means there's almost nothing available for rent.

I'm having trouble finding an official definition from a source that reports them, but my definition matches things that I can find online, eg https://www.brickunderground.com/rent/vacancy-rate-what-does...


Do you have any actual data on the rate of unoccupied properties that are not recently or soon-to-be available to rent, in any major US market? From my brief perusing, that kind of data seems hard to find. I'm very interested in seeing some reliable data on this.

I had thought such units would have been included in the housing vacancy statistics, but apparently they are not.


I haven’t spent much time looking at any place other than New York. But there’s census data, tax data, and a lot of public records. The number of empty units is small. The total is probably close to 40k, but that’s a fuzzy number and moving target. That includes regular vacant units.

https://gothamist.com/news/how-many-nyc-apartments-are-vacan...


I recently wrote a similar tool focused more on optimizing the case of exploring millions or billions of objects when you know a few aspects of the path: https://github.com/quodlibetor/s3glob

It supports glob patterns like so, and will do smart filtering at every stage possible: */2025-0[45]-*/user*/*/object.txt

I haven't done real benchmarks, but it's parallel enough to hit S3 parallel-request limits and file-system open-file limits when downloading.
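
Roughly, the idea (invocation shape here is illustrative; see the README for the actual CLI):

    # illustrative only; check the repo README for the real CLI
    s3glob 's3://logs-bucket/*/2025-0[45]-*/user*/*/object.txt'
    # the literal prefix before the first wildcard can be listed
    # directly, and each wildcard segment filters the listing before
    # recursing, instead of scanning the whole bucket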


I have been chasing the gerrit code review high since I left a company that used it almost 5 years ago.

Stacked pull requests are usually what people point to to get this back, but this article points out that _just_ stacked pull requests don't handle it correctly. Specifically, with GitHub you can't really see the differences in response to code review comments; you just get a new commit. Additionally, GitHub often loses conversations on lines that have disappeared due to force pushes.

That said, I have a couple of scripts that make it easier to work with stacks of PRs (the git-*stack scripts in [1]) and a program, git-instafix[2], that makes amending old commits less painful. I recently found ejoffe/spr[3], which seems like a tool that is similar to my scripts but much more pleasant for working with stacked PRs.

There's also spacedentist/spr[4], which gets _much_ closer to gerrit-style "treat each commit like a change and make it easier for people to review responses" with careful branch and commit management. Changes don't create new commits locally; they only create new commits in the PR that you're working on. Unfortunately, it has many more rough edges than ejoffe/spr and is less maintained.

[1]: https://github.com/quodlibetor/dotfiles/tree/main/dot_local/... [2]: https://github.com/quodlibetor/git-instafix/ [3]: https://github.com/ejoffe/spr [4]: https://github.com/spacedentist/spr
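
For comparison, the stock-git version of the "amend an old commit" flow that these tools streamline looks roughly like:

    git add -p                                  # stage just the follow-up fix
    git commit --fixup=<old-commit>             # mark it as a fixup of that commit
    git rebase -i --autosquash '<old-commit>^'  # fold it back into place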


I'm pretty sure that the scripts generated by inshellisense use CRLF line endings, and the carriage returns aren't recognized by Unix shells.

You should be able to fix it with:

    vi $HOME/.inshellisense/key-bindings.zsh -c "set ff=unix" -c ":wq"
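
Or, equivalently, strip the carriage returns directly (on macOS/BSD sed, use -i '' instead of -i):

    sed -i 's/\r$//' $HOME/.inshellisense/key-bindings.zsh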


The dependency list[0] looks pretty reasonable; AFAICT the overwhelming majority of that line-of-code count comes from autogenerated Windows API bindings.

edit: actual counts (columns: crate, language, file count, total lines, code, comments, blanks):

  $ cargo vendor && cd vendor && {
      for p in * ; do
          echo -n $p
          tokei $p | rg '^\s+Rust'
      done
  } | sort -n -k 4 | tabulate
  ----------------------------  ----  ---  ------  ------  ----  ----
  errno-dragonfly               Rust    2       9       8     0     1
  windows_aarch64_gnullvm       Rust    2      11       9     0     2
  windows_aarch64_msvc          Rust    2      11       9     0     2
  windows_i686_gnu              Rust    2      11       9     0     2
  windows_i686_msvc             Rust    2      11       9     0     2
  windows_x86_64_gnu            Rust    2      11       9     0     2
  windows_x86_64_gnullvm        Rust    2      11       9     0     2
  windows_x86_64_msvc           Rust    2      11       9     0     2
  winapi-i686-pc-windows-gnu    Rust    2      25      13    12     0
  winapi-x86_64-pc-windows-gnu  Rust    2      25      13    12     0
  windows-targets               Rust    1      54      46     3     5
  output_vt100                  Rust    2      67      55     0    12
  cfg-if                        Rust    2     164     131    16    17
  instant                       Rust    4     316     260     6    50
  countsctor                    Rust    2     331     254    21    56
  errno                         Rust    5     375     280    41    54
  diff                          Rust    4     561     485     9    67
  autocfg                       Rust    9     702     558    41   103
  yansi                         Rust    7     741     627     3   111
  signal-hook-registry          Rust    3     818     566   150   102
  fastrand                      Rust    4     830     710    16   104
  hermit-abi                    Rust    4     847     601     5   241
  pretty_assertions             Rust    5    1231    1072    33   126
  glob                          Rust    2    1589    1291   113   185
  bitflags                      Rust   20    1715    1373   105   237
  unicode-ident                 Rust   11    1794    1697    36    61
  signal-hook                   Rust   17    1969    1520   147   302
  tempfile                      Rust   15    2367    1928   102   337
  quote                         Rust   17    2458    1979   148   331
  redox_syscall                 Rust   23    3595    2996    83   516
  log                           Rust    9    3635    2970    97   568
  io-lifetimes                  Rust   15    4218    3605    80   533
  proc-macro2                   Rust   17    5286    4514   139   633
  cc                            Rust   13    5861    4767   488   606
  rustix                        Rust  236   39927   33837  1467  4623
  syn                           Rust   92   51956   48946   493  2517
  libc                          Rust  224  109836   99688  2073  8075
  linux-raw-sys                 Rust   61  145628  145455    84    89
  winapi                        Rust  405  179933  176630  3299     4
  windows-sys                   Rust  281  497624  497608     4    12
  ----------------------------  ----  ---  ------  ------  ----  ----
[0]: https://github.com/memorysafety/sudo-rs/blob/60985b2f5f7ffa8...


And this is what that small list of dependencies pulls in:

https://github.com/memorysafety/sudo-rs/blob/60985b2f5f7ffa8...


And most of those dependencies are only transitive via [dev-dependencies], which causes the entire Windows API to be pulled in.


Still doesn’t invalidate the fact that it’s a lot of surface area in which to hide a targeted attack.


If you're intending to install the package on an image, [dev-dependencies] are not going to be included in the package. So, no, it's not actually relevant to the surface area of the package.
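
A minimal sketch of the distinction (crate names illustrative, not sudo-rs's actual manifest):

    [dependencies]          # compiled into the shipped binary
    libc = "0.2"

    [dev-dependencies]      # compiled only for tests/examples/benches;
                            # not linked into the release binary
    pretty_assertions = "1"
    tempfile = "3"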


It matters.

The great-grandparent was talking about auditing for supply-chain attacks.

dev-dependencies can run code on the dev machine at compile time (e.g. via build scripts and proc macros).


If a C library used Python to run its tests, I don't think we would consider the whole Python interpreter to be part of the software supply chain for that library. Sure it's possible that running tests on a build machine could let an attacker corrupt the build later, with a bad PyPI package or something. But that feels more like a "not having a clean build environment" problem than a "this project has too many dependencies" problem. I think the fact that Cargo manages these two lists in the same file makes the relationship feel tighter, but I'm not sure it's actually tighter.


I see your point, but if you're going to consider all code that runs on the dev machine as a source of supply chain attacks, that's going to include all LOC for the Linux kernel. And the LOC for dev's web browser that they use to browse the issue tracker. And so on.


If you start doing a commit (via `c` in the magit status buffer, with the standard semantics of "you're going to commit everything that's currently staged") you can press capital F for an instant fixup, or capital S for instant squash.

When you press either of those, magit pops up a commit picker which shows the current git log. Selecting a commit will then instantaneously apply your staged changes to the selected commit. It's much simpler than any of the other workflows I've seen in response to your question.

The gif in this repo (for a CLI tool I made for some jealous coworkers that simulates this behavior) tries to show the workflow: https://github.com/quodlibetor/git-fixup

That said, this _doesn't_ support the "automatically figure out which commits to apply hunks to" workflow. I personally find that I use both workflows depending on the nature of my changes.


This is a fantastic comment!

One small thing: we do now have tables[1]! At the moment they are ephemeral and only support inserts -- no update/delete. We will remove both of those limitations over time, though!

[1]: https://materialize.com/docs/sql/create-table/
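
e.g., roughly the currently supported surface (per the linked docs):

    CREATE TABLE t (a int, b text);
    INSERT INTO t VALUES (1, 'hello');
    -- UPDATE and DELETE are not yet supported, and contents are
    -- ephemeral (not persisted across restarts)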

