This is the thing: I don't understand how the company is going to function once all those folks become multi-millionaires. It is a really odd situation... I don't know if anything quite like it has occurred before. Even G, M and F had slower stock-price trajectories, I think. Nvidia wasn't known for having the best engineers (no disrespect, but I don't think it was as hard to get into as the other big tech companies 4-5 years back). I heard a story that at Microsoft back in the old days, people would wear a badge saying FUIV (FU, I'm vested).
When I was a kid, going to a restaurant was a treat and a privilege. If my sister or I misbehaved, we were taken out to the car, and might not get to go to a restaurant again for a while.
I see kids in restaurants these days and mostly find their behavior appalling. And it's sad the best-behaved kids are only quiet because they have an iPad in front of them. (No headphones, of course, so that's another annoyance the rest of us have to put up with.)
When I was a kid the only restaurant I was allowed to go to was Friendly's, and only on my or my siblings' birthdays. It was a HUGE treat and I knew to be on my very best behavior, because if there was any acting up, even a little, I wouldn't be allowed to eat out ever again.
Nowadays kids aren't expected to behave in a restaurant, so they don't. It's about expectations.
I've heard about the lotto system but assumed the school district would be obligated to bus kids (i.e. not force them to use public transit alone). The parents have issues with the school bus??
I visited Seattle proper recently and also felt depressed. Not sure why I got that vibe. On paper Seattle is great, and no doubt the Pacific Northwest has great nature and a strong tech industry. It was odd that Seattle still felt "different".
The Seattle Freeze seems like one of those broad stereotypes, but I experienced it as very real. People are not unfriendly, but they are unsocial. I felt lonelier than any other place I've lived. There are of course many other factors, but I'm not the only one, which is validating.
I'm extremely miffed. Mulling leaving the country at some point in the near future, and that triggers a much higher tax bill (considered a deemed disposition). There is just no way to get ahead in this place.
I did not invest in the stock market properly over the last decade (mid-40s; tech income, but I just put it toward paying off the house; I estimate this cost me at least 5 million bucks, conservatively). I'm opening a brokerage account and will slow down paying off the house.
I am confused about how all these things are able to interoperate. Are the creators of these models following the same I/O conventions for their models? Won't the tokenizer or token embedder be different? I am genuinely confused by how the same code works for so many different models.
It's complicated, but basically because most are Llama-architecture models. Meta all but set the standard for open-source LLMs when they released Llama 1, and anyone trying to deviate from it has run into trouble because their models don't work with the hyper-optimized Llama runtimes.
Also, there's a lot of magic going on behind the scenes with configs stored in gguf/huggingface format models, and the libraries that use them. There are different tokenizers, but they mostly follow the same standards.
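To make the "configs behind the scenes" point concrete, here's a toy sketch of how one loader can serve many models: it reads an architecture string from the model's stored metadata (config.json in Hugging Face repos, key-value metadata in GGUF) and dispatches to a matching implementation. All names below are hypothetical, not any real library's API.

```python
# Toy sketch of config-driven model loading (all names hypothetical).
# Real loaders do something similar: read the architecture name from
# config.json / GGUF metadata, then dispatch to the right model class.

ARCH_REGISTRY = {}

def register(name):
    """Register a model class under an architecture name."""
    def wrap(cls):
        ARCH_REGISTRY[name] = cls
        return cls
    return wrap

@register("llama")
class LlamaModel:
    def __init__(self, config):
        self.hidden = config["hidden_size"]

@register("mistral")
class MistralModel:
    def __init__(self, config):
        self.hidden = config["hidden_size"]

def load_model(config):
    # One code path for every model: the metadata says which class to build.
    arch = config["architecture"]
    if arch not in ARCH_REGISTRY:
        raise ValueError(f"unsupported architecture: {arch}")
    return ARCH_REGISTRY[arch](config)

model = load_model({"architecture": "llama", "hidden_size": 4096})
print(type(model).__name__)  # LlamaModel
```

This is also why "deviating from Llama" hurts: a model whose architecture string isn't in the runtime's registry simply won't load.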
Can any kind soul explain the difference between GGUF, GGML, and all the other model packaging formats I am seeing these days? I was used to .pth and the format TF uses. Is this all to support inference, or quantization? Who manages these formats, or are they brewing organically?
I think it's mostly an organic process arising from the ecosystem.
My personal way of understanding it is this - the original sin of model weight format complexity is that NNs are both data and computation.
Representing the computation as data is the hard part, and that's where the simplicity falls apart. Do you embed the compute graph? If so, what do you do about different frameworks supporting overlapping but distinct sets of operations? Do you need the artifact to make training reproducible? Well, that's an even more complex computation that you have to serialize as data. And so on.
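A toy illustration of the "computation as data" problem (purely illustrative, not any real serialization format): a compute graph can be stored as a plain list of ops, but it only round-trips if the loading framework implements every op name, which is exactly where frameworks diverge.

```python
import json

# A tiny compute graph serialized as data (illustrative only).
graph = [
    {"op": "matmul", "inputs": ["x", "w"], "output": "h"},
    {"op": "relu",   "inputs": ["h"],      "output": "y"},
]

# This hypothetical loader's op set; another framework's set overlaps
# but is not identical, so some graphs won't load there.
SUPPORTED_OPS = {"matmul", "relu", "add"}

def load_graph(serialized):
    ops = json.loads(serialized)
    for node in ops:
        if node["op"] not in SUPPORTED_OPS:
            raise ValueError(f"unsupported op: {node['op']}")
    return ops

blob = json.dumps(graph)
print(len(load_graph(blob)))  # 2
```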
It's all mostly just inference, though some train LoRAs directly on quantized models too.
GGML and GGUF are the same lineage; GGUF is the newer version that adds more metadata about the model, so it's easier to support multiple architectures, and it also embeds prompt templates. These can run CPU-only, or be partially or fully offloaded to a GPU. With K-quants, you can get anywhere from a 2-bit to an 8-bit GGUF.
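The core idea behind those k-bit quants is per-block scaling: store one float scale per block of weights plus small integers. This is a deliberately simplified sketch; real GGUF K-quants use more elaborate block layouts with extra scale/min values.

```python
# Simplified per-block symmetric quantization (illustrative only;
# real GGUF K-quants use fancier block formats).

def quantize_block(weights, bits=4):
    # Map floats onto signed ints in [-(2**(bits-1)-1), 2**(bits-1)-1]
    # using a single scale for the whole block.
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax or 1.0
    return scale, [round(w / scale) for w in weights]

def dequantize_block(scale, qs):
    return [q * scale for q in qs]

block = [0.12, -0.5, 0.31, 0.07]
scale, qs = quantize_block(block, bits=4)
restored = dequantize_block(scale, qs)
# Round-to-nearest keeps each weight within half a quantization step.
print(max(abs(a - b) for a, b in zip(block, restored)) <= scale / 2)  # True
```

Fewer bits means a coarser grid (at 2 bits, qmax is only 1), which is why low-bit quants trade quality for size.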
GPTQ was the GPU-only optimized quantization method. It was superseded by AWQ, which is roughly 2x faster, and now by EXL2, which is better still. These are usually only 4-bit.
Safetensors and PyTorch .bin files are raw float16 model files; these are only really used for continued fine-tuning.
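Some back-of-the-envelope math shows why the bit width dominates file size. Roughly, size ≈ parameters × bits / 8 bytes; this ignores quantization metadata (per-block scales and such), which adds some overhead on top.

```python
# Rough model file size estimate: params * bits_per_weight / 8 bytes.
# Ignores quantization metadata (scales, etc.), which adds overhead.

def approx_size_gb(n_params, bits):
    return n_params * bits / 8 / 1e9

params_7b = 7e9
print(approx_size_gb(params_7b, 16))  # 14.0 -> float16 safetensors/.bin
print(approx_size_gb(params_7b, 4))   # 3.5  -> a 4-bit quant
```

That 4x shrink from float16 to 4-bit is what makes partial or full GPU offload of big models feasible on consumer cards.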
Of the loaders I commonly use, I've only seen the embedded prompt template read by text-generation-webui. In the GGML days it had a long hardcoded list of known models and the templates they use, so a template could be auto-selected (which was often wrong); now it just grabs the template from the model directly and sets it when the model is loaded.
Err... media people take visual quality and aesthetics very, very seriously. The director has a vision, and the tech goes to amazing lengths to support it. It is a different world, as the original post said.
I also worked at IBM (Research) early in my career. Most folks I know left, but some good folks went into manager/leader positions. I always wonder... do managers/leaders get paid better at IBM Research? Because the people there are really good.