
Introduction

Let $x \in \mathbb{R}^n$, and let $\sigma: \mathbb{R} \to \mathbb{R}$ be a function. A single-hidden-layer neural network is a finite linear combination of the form

$$\Phi(x) = \sum_{j=1}^{M} \alpha_j \, \sigma(w_j^\top x + b_j),$$

where $\alpha_j \in \mathbb{R}$ are the output weights, $w_j \in \mathbb{R}^n$ are the input weights, $b_j \in \mathbb{R}$ are the biases, and $M \in \mathbb{N}$ is the number of hidden units.
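As a concrete illustration, here is a minimal NumPy sketch of this definition. The function name `phi`, the array layout, and the choice of $\sigma = \tanh$ are assumptions made for the example, not part of the definition above.

```python
import numpy as np

def phi(x, alpha, W, b, sigma=np.tanh):
    """Evaluate Phi(x) = sum_j alpha_j * sigma(w_j^T x + b_j).

    x     : input vector, shape (n,)
    alpha : output weights, shape (M,)
    W     : input weights, shape (M, n) -- row j is w_j
    b     : biases, shape (M,)
    sigma : activation function, applied elementwise
    """
    return alpha @ sigma(W @ x + b)

# Example: a network with n = 3 inputs and M = 5 hidden units.
rng = np.random.default_rng(0)
n, M = 3, 5
x = rng.normal(size=n)
alpha, W, b = rng.normal(size=M), rng.normal(size=(M, n)), rng.normal(size=M)
print(phi(x, alpha, W, b))  # a single real number, Phi(x)
```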

Then the set of all single-hidden-layer neural networks with $M$ hidden units is

$$\mathcal{N}^{(M)}_n(\sigma) = \left\{ \Phi: \mathbb{R}^n \to \mathbb{R} \,\middle|\, \Phi(x) = \sum_{j=1}^{M} \alpha_j \, \sigma(w_j^\top x + b_j) \text{ for some } \alpha_j, b_j \in \mathbb{R},\ w_j \in \mathbb{R}^n \right\}.$$

Now, the set of all single-hidden-layer neural networks with an arbitrarily large number of hidden units is

$$\mathcal{N}_n(\sigma) = \bigcup_{m=1}^{\infty} \mathcal{N}^{(m)}_n(\sigma).$$
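Note that appending a hidden unit whose output weight is zero leaves $\Phi$ unchanged, so these classes are nested: $\mathcal{N}^{(m)}_n(\sigma) \subseteq \mathcal{N}^{(m+1)}_n(\sigma)$, and the union above is a nested union. The short sketch below checks this numerically; it redefines the illustrative `phi` from the earlier example so that it runs on its own.

```python
import numpy as np

def phi(x, alpha, W, b, sigma=np.tanh):
    # Phi(x) = sum_j alpha_j * sigma(w_j^T x + b_j)
    return alpha @ sigma(W @ x + b)

rng = np.random.default_rng(1)
n, m = 3, 5
x = rng.normal(size=n)
alpha, W, b = rng.normal(size=m), rng.normal(size=(m, n)), rng.normal(size=m)

# Append a unit with output weight 0: Phi is unchanged, so this network,
# viewed as a member of N^(m), also lies in N^(m+1).
alpha_pad = np.append(alpha, 0.0)
W_pad = np.vstack([W, rng.normal(size=n)])  # the padded unit's weights are arbitrary
b_pad = np.append(b, rng.normal())

assert np.isclose(phi(x, alpha, W, b), phi(x, alpha_pad, W_pad, b_pad))
```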