
This is not true - the break-even period is closer to 6-7 months.


A single 8xA100 server is ~$150k. The on-demand cost to rent an equivalent instance is $8.80/hour. Do the math, and don't forget the energy costs.
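Roughly, for a sense of scale (the power draw and energy price below are assumptions on my part, not numbers from this thread):

    # Rough version of "do the math" using the numbers above.
    # Power draw and energy price are assumptions; colo rates vary widely.
    SERVER_COST = 150_000   # ~$150k for one 8xA100 server, as quoted
    RENTAL_RATE = 8.80      # $/hour on-demand for an equivalent instance
    POWER_KW = 6.5          # assumed average draw of an 8xA100 box, in kW
    ENERGY_PRICE = 0.12     # assumed $/kWh

    # Each hour you own the box instead of renting, you save the rental
    # rate but pay for electricity.
    hourly_saving = RENTAL_RATE - POWER_KW * ENERGY_PRICE
    hours = SERVER_COST / hourly_saving
    print(f"break-even: {hours / 24 / 30:.1f} months")  # ~26 months at 100% utilization

At these prices the break-even is measured in years, not months - which is why the purchase price matters so much.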


I'd suggest finding a cheaper vendor if that is the lowest price you can get for an 8xA100 server. We spend a lot on both cloud and our own hardware, and we colo our servers, so I've definitely done the math!


Six months ago I contacted 12 different vendors; the quotes for four 8xA100 servers ranged from $130k to $200k each. You probably wouldn't want to buy from the low-end vendors.

Keep in mind, there are three important advantages of cloud:

1. You only pay for what you use (hourly). What is the utilization of your on-prem servers?

2. You don't have to pay upfront, so it's easier to ask for budget.

3. You can upgrade your hardware easily as soon as new GPU models become available.


I know how much we paid, and it is substantially less than what you were quoted - very likely from one of the 12 providers you contacted.

It is likely you just didn't realize how much margin these providers have and did not negotiate enough. How else do you think cloud providers can afford the rates they are offering? The way you describe it, places like CoreWeave would be operating as a charity. That isn't true - they just got better prices than you.

Our inference setup is 7 figures, has been running for a while (with new servers purchased frequently along the way), and there have been no issues - the cards, CPUs, and RAM are all top-of-the-line server hardware.

1. For inference (which is 80%+ of our need), our utilization is 100%, 24/7/365. For workloads that are variable (like training) we often do use cloud - as I mentioned, we do both.

2. I am the CEO, so I am not sure who I would be asking for budget?

3. At this point we would have paid more for cloud than we spent purchasing our own hardware. There is nothing stopping us from getting new hardware, or cloud with newer cards, while still owning our current hardware. In fact, since our costs over the last year were lower because we bought our own hardware, it is actually easier for us to afford newer cards.


Yes, obviously cloud providers get their hardware at a fraction of the cost I was quoted - they are ordering thousands of servers; I was only buying four. No one would negotiate with me; I tried. I suppose if I had a 7-digit budget I could get a better deal.

I was mainly talking about training workloads; inference is a different beast. I'm actually surprised you have 100% inference utilization - customer load typically scales dynamically, so with on-prem servers you would need to over-provision.

CEOs don't usually order hardware; they have IT people for that, with input from people like me (ML engineers) who can estimate the workloads, future needs, and specific hardware requirements (e.g. GPU memory). And when your people come to you asking for budget, while you're trying to raise the next round, you're more likely to approve the 'no high upfront cost' option, right?

In my situation, when asked about buy vs. rent, my initial reaction was "definitely buy", but when I actually looked at the numbers, the 3-year break-even period, the lack of upfront costs for cloud, and not needing to provision storage and networking made it an easy recommendation. The cost of cloud GPUs has come down dramatically in the last couple of years.
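For a sense of how a 3-year figure falls out of the same arithmetic once utilization is less than 24/7 (the rate and utilization below are illustrative assumptions, not my actual quotes):

    # Same break-even arithmetic, but with bursty training-style usage
    # instead of a 24/7 load. Both inputs are illustrative assumptions.
    SERVER_COST = 150_000   # hardware price from earlier in the thread
    RENTAL_RATE = 8.80      # $/hour on-demand
    UTILIZATION = 0.65      # assumed: busy during experiments, idle otherwise

    # Renting only bills for the busy hours; buying pays upfront either way.
    monthly_saving = RENTAL_RATE * UTILIZATION * 24 * 30
    print(f"break-even: {SERVER_COST / monthly_saving:.1f} months")  # ~36 months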

That said, I would like to have at least a couple of local GPU servers for quick experimentation/prototyping, because sometimes the overhead of spinning up a new instance and copying datasets is too great relative to the task.


> I suppose if I had a 7-digit budget I could get a better deal.

We got our "deal" when buying just a single server and have since bought many more with the same provider. We didn't spend 7 figures all at once; we did it piecemeal over time. There is nothing stopping you from getting much better prices.

> I'm actually surprised you have 100% inference utilization - customer load typically scales dynamically, so with on-prem servers you would need to over-provision.

It is pretty easy to achieve 100% inference utilization if you can find inference work that does not need to be done on demand. We have a priority queue, and the lower-priority work gets done during periods of lower demand.
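A minimal sketch of that kind of setup (hypothetical structure and names - the only detail given here is "priority queue"): on-demand requests always jump ahead, and batch work backfills idle capacity.

    import heapq
    import itertools

    # Lower priority number = more urgent; the counter keeps FIFO order
    # among jobs of equal priority.
    ON_DEMAND, BATCH = 0, 1
    _counter = itertools.count()
    _queue = []  # heap of (priority, arrival_order, job)

    def submit(job, priority=BATCH):
        heapq.heappush(_queue, (priority, next(_counter), job))

    def next_job():
        """Called whenever a GPU frees up. On-demand work always wins;
        batch jobs only run when nothing urgent is waiting, which is how
        the boxes stay busy around the clock."""
        if _queue:
            _, _, job = heapq.heappop(_queue)
            return job
        return None

    # Usage: customer traffic goes in as ON_DEMAND, offline work as BATCH.
    submit("customer-request-1", ON_DEMAND)
    submit("nightly-embedding-backfill")       # defaults to BATCH
    assert next_job() == "customer-request-1"  # urgent work is served first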

> CEOs don't usually order hardware; they have IT people for that, with input from people like me (ML engineers) who can estimate the workloads, future needs, and specific hardware requirements (e.g. GPU memory).

Judging by this conversation, it seems like "people like you" may not be the best people to answer this question, since the best hardware quote you could get was at a >100% markup! At a startup that specializes in ML research and work, the CEO is going to be intimately familiar with ML workloads, needs, and hardware requirements.

> And when your people come to you asking for budget, while you're trying to raise the next round, you're more likely to approve the 'no high upfront cost' option, right?

If the break-even point is 6-7 months and our runway is longer than 6-7 months, why would this matter?


> the best hardware quote you could get was at a >100% markup!

Now I’m really curious - if you can share - how much did you pay, and when was it? Are you talking about 40GB or 80GB cards? How did you negotiate? Any attempts I made were shut down with a simple “no, that’s our final price”. What’s the secret?

> At a startup that specializes in ML research and work, the CEO is going to be intimately familiar with ML workloads, needs, and hardware requirements.

I work at a startup that builds hardware accelerators, primarily for large NLP models. A large part of my job is to be intimately familiar with ML workloads, needs, and hardware requirements. Our CEO definitely doesn’t have enough of that knowledge to choose the right hardware for our ML team. In fact, even most people on our ML team don’t have deep, up-to-date knowledge of GPUs, GPU servers, or GPU server clusters. I happen to know because I’ve always been interested in hardware, and I’ve been building GPU clusters since grad school.


As mentioned in another comment, the contract has very clear language forbidding us from sharing it - likely because they are offering different prices to different companies.

So I don't feel comfortable sharing any specifics, especially since this account is directly tied to my name.

With that being said, the negotiation process was pretty straightforward:

- Emailed several vendors telling them we are a small startup, we are looking to make many purchases, but right now we are starting with one. We told everyone our purchasing decision was based solely on cost (given equivalent hardware) and asked them to please put their best quote forward.

- Got back all of our prices. Went to the second-cheapest one, told them they were beat, and offered them the chance to go lower, which they did. We went with that vendor.

- For our next purchase, we went to the original lowest vendor (who had been beaten out), told them they lost on price, and said that if they could go lower we would go with them and continue to give them business moving forward. They went quite a bit lower than what they had originally offered, and lower than what the vendor we first purchased from gave us. We bought our second order from them and have used them ever since.


> We got our "deal" when buying just a single server and have since bought many more with the same provider. We didn't spend 7 figures all at once; we did it piecemeal over time. There is nothing stopping you from getting much better prices.

If it is as easy as you make it sound, why would you not just share the vendor name? I personally would love an 8xH100 machine for transformer experiments, but $100k+ pricing makes it a non-starter.


The contract has very clear language forbidding us from sharing it - likely because they are offering different prices to different companies.

(And as p1esk mentioned, there is no way you are getting H100s for under $100k).


An 8xH100 machine is ~$300k, I’ve heard.


Well, the person above claims to have paid significantly under $130k for an 8xA100 machine. I am curious to hear more.


Sure, but you mentioned an H100 machine, and those are about 2.5x more expensive.



