Overview
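Supervised Learning: Given a dataset \(D=\{(\boldsymbol{x}_i, \boldsymbol{y}_i)\}_{i = 1}^N\), fit a model \(\hat{\boldsymbol{y}} = \boldsymbol{f}(\boldsymbol{x};\theta)\) by choosing the parameters \(\theta\) that minimize a loss function \(l(\boldsymbol{y}, \hat{\boldsymbol{y}})\), typically averaged over the dataset: \(\theta^* = \arg\min_\theta \frac{1}{N}\sum_{i=1}^N l(\boldsymbol{y}_i, \boldsymbol{f}(\boldsymbol{x}_i;\theta))\).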
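A minimal sketch of this setup, assuming a linear model for \(\boldsymbol{f}\), mean squared error for \(l\), and plain gradient descent as the optimizer. None of these choices come from the notes; the names `fit_linear` and `mse`, the learning rate, the step count, and the synthetic data are all hypothetical and only illustrate the idea.

```python
import numpy as np


def mse(y, y_hat):
    """Loss l(y, y_hat): mean squared error over the dataset."""
    return float(np.mean((y - y_hat) ** 2))


def fit_linear(X, y, lr=0.1, steps=500):
    """Fit f(x; theta) = x @ w + b by gradient descent on the MSE.

    theta = (w, b). lr and steps are illustrative defaults, not tuned values.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        y_hat = X @ w + b                 # model prediction
        err = y_hat - y                   # residuals
        grad_w = 2.0 / n * (X.T @ err)    # d(MSE)/dw
        grad_b = 2.0 * err.mean()         # d(MSE)/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic, noise-free dataset D = {(x_i, y_i)} (assumed for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w, true_b = np.array([2.0, -1.0]), 0.5
y = X @ true_w + true_b

w, b = fit_linear(X, y)
final_loss = mse(y, X @ w + b)
```

Because the data here is generated by a linear rule without noise, the fitted `w` and `b` should approach `true_w` and `true_b`, and `final_loss` should approach zero. With real data the loss would generally stay above zero.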
History
1873 Alexander Bain: The information is in the connections.
Not writing any more :(