You can say the same thing about a 24GB consumer card. Going from being able to ...

paulmd · on Aug 10, 2023

Is 70B not commercially useful because of the model storage requirements, the total inference performance, the additional memory needed per inference session, or what?

Is the output enough better to make it desirable, or is this just a case of "too much performance hit for a marginal gain"?

latchkey · on Aug 10, 2023

No one is arguing any of that. You're the one that brought up the 580 specifically.

By the way, still waiting for you to take me up on your 'bet'.

superkuh · on Aug 10, 2023

I was wrong. Sorry. Food trucks do accept cash in most places.

Now it's your turn, Mr. "You're not going to find rx580's with enough vram for AI. Typically 4-8gb." This is completely false. Rather than acknowledging that, you then tried to move the goalposts (much like I did in that past thread, saying, "Oh, but maybe it's just my region where they don't."). It looks like we both behave a bit silly when trying to save face when we're wrong.

latchkey · on Aug 10, 2023

> This is completely false.

It isn't completely false. You're doing super limited stuff as a hobbyist that barely works.

superkuh · on Aug 11, 2023

The parent article is entirely about running and benchmarking 4-bit quantized Llama2-7B/13B. This is the "super limited stuff as a hobbyist that barely works", and I've run those models at entirely usable speeds on the AMD RX 580. You're either wrong, or you didn't actually read the article and have been arguing correctly (from your ignorant perspective) about something random.

latchkey · on Aug 11, 2023

"entirely usable" is not the same as "roi efficient"

> from your ignorant perspective

no need for the ad hominem.

superkuh · on Aug 14, 2023

Ignorance is not an insult. It just became obvious that you were talking about a different concept (commercial use with big models) than the one the article itself and everyone else were talking about (7B/13B models). So I generously assumed you just hadn't read it (ignorance). I guess now that you've ignored that and doubled down, I can assume you were/are just arguing in bad faith.