Papers to Read

[2306.11922] No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths
[2603.27432] The Geometric Cost of Normalization: Affine Bounds on the Bayesian Complexity of Neural Networks

DSB (Distributional Simplicity Bias)

[2410.19637] A distributional simplicity bias in the learning dynamics of transformers
Belrose: Neural networks learn statistics of increasing… - Google Scholar
- Look up the papers that cite this paper
[2603.12901] A theory of learning data statistics in diffusion models, from easy to hard
[2510.04285] Probing Geometry of Next Token Prediction Using Cumulant Expansion of the Softmax Entropy
[2602.12257] On the implicit regularization of Langevin dynamics with projected noise
Neural networks trained with SGD learn distributions of increasing complexity | OpenReview
SGD on Neural Networks Learns Functions of Increasing Complexity
[1805.08522] Deep learning generalizes because the parameter-function map is biased towards simple functions

SLT for Deep Learning

D. Murfet

https://arxiv.org/search/cs?searchtype=author&query=Murfet,+D


Grokking

[2603.05228] The Geometric Inductive Bias of Grokking: Bypassing Phase Transitions via Architectural Topology

Classics

Other: hard to categorize, or just saving for now

https://github.com/Hiroki11x/Papers I'd like to use this as a reference.

Paper List I Feel I Need to Read | Hiroki Naganuma