| | Beyond Power Laws: Scaling Laws for Next-Token Prediction (francisbach.com) |
| 2 points by frozenseven 4 months ago | past |
|
| | The unreasonable effectiveness of Richardson extrapolation (francisbach.com) |
| 1 point by d_tr 9 months ago | past |
|
| | Learning Theory from First Principles (francisbach.com) |
| 3 points by rzk on Jan 6, 2025 | past |
|
| | Scaling Laws of Optimization (francisbach.com) |
| 11 points by matt_d on Oct 6, 2024 | past | 1 comment |
|
| | Revisiting the Classics: Jensen's Inequality (2023) (francisbach.com) |
| 89 points by cpp_frog on Aug 21, 2024 | past | 8 comments |
|
| | Unraveling spectral properties of kernel matrices (francisbach.com) |
| 1 point by lnyan on Jan 8, 2024 | past |
|
| | Revisiting the Classics: Jensen’s Inequality (francisbach.com) |
| 1 point by cpp_frog on March 14, 2023 | past |
|
| | Rethinking SGD’s Noise (francisbach.com) |
| 1 point by matt_d on July 25, 2022 | past |
|
| | Polynomial magic I: Chebyshev polynomials (francisbach.com) |
| 2 points by vector_spaces on Dec 23, 2021 | past |
|
| | The many faces of integration by parts – I: Abel transformation (francisbach.com) |
| 1 point by mariuz on Aug 5, 2020 | past |
|
| | Gradient descent for wide two-layer neural networks – I: Global convergence (francisbach.com) |
| 1 point by matt_d on June 1, 2020 | past |
|
| | Computer-Aided Analyses in Optimization (francisbach.com) |
| 5 points by matt_d on April 3, 2020 | past |
|
| | On the unreasonable effectiveness of Richardson extrapolation (francisbach.com) |
| 5 points by matt_d on March 1, 2020 | past |
|
| | The sum of a geometric series is all you need (francisbach.com) |
| 1 point by matt_d on Jan 6, 2020 | past |
|
| | Polynomial Magic II: Jacobi Polynomials (francisbach.com) |
| 2 points by matt_d on Dec 2, 2019 | past |
|
| | Polynomial magic I: Chebyshev polynomials (francisbach.com) |
| 3 points by matt_d on Nov 4, 2019 | past |
|