jsenn's favorites | Hacker News

1.		Weight-sparse transformers have interpretable circuits [pdf] (cdn.openai.com)
		78 points by 0x79de 57 days ago | 46 comments
2.		SplitQuantV2: Enhancing Low-Bit Quantization of LLMs Without GPUs (arxiv.org)
		34 points by PaulHoule 9 months ago | 9 comments
3.		32k context length text embedding models (voyageai.com)
		101 points by fzliu on Nov 24, 2024 | 32 comments
4.		Show HN: Llama 3.2 Interpretability with Sparse Autoencoders (github.com/paulpauls)
		579 points by PaulPauls on Nov 21, 2024 | 99 comments
5.		Quantized Llama models with increased speed and a reduced memory footprint (meta.com)
		508 points by egnehots on Oct 24, 2024 | 122 comments
6.		Detecting when LLMs are uncertain (thariq.io)
		283 points by trq_ on Oct 25, 2024 | 165 comments
7.		Better RAG Results with Reciprocal Rank Fusion and Hybrid Search (assembled.com)
		249 points by johnjwang on May 30, 2024 | 57 comments
8.		Vector indexing all of Wikipedia on a laptop (foojay.io)
		513 points by tjake on May 29, 2024 | 140 comments
9.		Unprojecting text with ellipses (2016) (mzucker.github.io)
		151 points by nmstoker on May 19, 2024 | 21 comments
10.		Making Sense of Acquire-Release Semantics (davekilian.com)
		159 points by sph on May 10, 2024 | 69 comments
11.		Binary array set (nayuki.io)
		63 points by stereoabuse on Mar 26, 2024 | 24 comments
12.		The Era of 1-bit LLMs: ternary parameters for cost-effective computing (arxiv.org)
		1040 points by fgfm on Feb 28, 2024 | 447 comments
13.		The Continuity of Splines [video] (youtube.com)
		192 points by mcorcuera on Dec 16, 2022 | 41 comments