- [2306.11922] No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths
- [2603.27432] The Geometric Cost of Normalization: Affine Bounds on the Bayesian Complexity of Neural Networks
DSB (distributional simplicity bias)
- [2410.19637] A distributional simplicity bias in the learning dynamics of transformers
Belrose: Neural networks learn statistics of increasing… - Google Scholar
- Looking up papers that cite this paper
- [2603.12901] A theory of learning data statistics in diffusion models, from easy to hard
- [2510.04285] Probing Geometry of Next Token Prediction Using Cumulant Expansion of the Softmax Entropy
- [2602.12257] On the implicit regularization of Langevin dynamics with projected noise
- Neural networks trained with SGD learn distributions of increasing complexity | OpenReview
- SGD on Neural Networks Learns Functions of Increasing Complexity
- [1805.08522] Deep learning generalizes because the parameter-function map is biased towards simple functions
SLT for Deep Learning
- You're Measuring Model Complexity Wrong — LessWrong
- [2410.02984] Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient
- [2511.04564] Uncertainties in Physics-informed Inverse Problems: The Hidden Risk in Scientific AI
- [2406.10234] Review and Prospect of Algebraic Research in Equivalent Framework between Statistical Mechanics and Machine Learning Theory
- Computational Complexity of Learning Neural Networks: Smoothness and Degeneracy
- Singular Learning Theory for Dummies — LessWrong
- Timaeus | Compressibility Measures Complexity: Minimum Description Length Meets Singular Learning Theory
- https://neurips.cc/virtual/2024/poster/95029
D. Murfet
Classics
Other: hard to categorize, or just saving for now
- https://www.ai-gakkai.or.jp/jsai2017/webprogram/2017/pdf/733.pdf
- [2303.14151] Double Descent Demystified: Identifying, Interpreting & Ablating the Sources of a Deep Learning Puzzle
- [1801.05894] Deep Learning: An Introduction for Applied Mathematicians
- [2603.05228] The Geometric Inductive Bias of Grokking: Bypassing Phase Transitions via Architectural Topology
https://github.com/Hiroki11x/Papers I want to use this as a reference.