Let $x \in \mathbb{R}^n$, and let $\sigma : \mathbb{R} \to \mathbb{R}$ be a function. A single-hidden-layer neural network is a finite linear combination of the form
\[
\Phi(x) = \sum_{j=1}^{M} \alpha_j \,\sigma\bigl(w_j^\top x + b_j\bigr),
\]
where $\alpha_j \in \mathbb{R}$ are the output weights, $w_j \in \mathbb{R}^n$ are the input weights, $b_j \in \mathbb{R}$ are the biases, and $M \in \mathbb{N}$ is the number of hidden units.
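As a concrete illustration, the definition above can be evaluated directly. The following is a minimal sketch using NumPy, with $\tanh$ as an example activation; the function name `phi` and the stacking of the $w_j$ into a matrix `W` are illustrative choices, not part of the definition.

```python
import numpy as np

def phi(x, alpha, W, b, sigma=np.tanh):
    """Evaluate Phi(x) = sum_j alpha_j * sigma(w_j^T x + b_j).

    x:     input vector, shape (n,)
    alpha: output weights, shape (M,)
    W:     input weights stacked row-wise, shape (M, n)
    b:     biases, shape (M,)
    sigma: elementwise activation function
    """
    # W @ x + b computes all M pre-activations w_j^T x + b_j at once;
    # alpha @ sigma(...) forms the linear combination over hidden units.
    return alpha @ sigma(W @ x + b)
```

For example, `phi(np.zeros(2), np.array([1.0, -1.0]), np.eye(2), np.zeros(2))` evaluates a network with $n = 2$ and $M = 2$ at the origin.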
Then the set of all single-hidden-layer neural networks with M hidden units is
\[
\mathcal{N}_n^{(M)}(\sigma) = \Bigl\{ \Phi : \mathbb{R}^n \to \mathbb{R} \;\Bigm|\; \Phi(x) = \sum_{j=1}^{M} \alpha_j \,\sigma\bigl(w_j^\top x + b_j\bigr),\ \alpha_j, b_j \in \mathbb{R},\ w_j \in \mathbb{R}^n \Bigr\}.
\]
Now the set of all single-hidden-layer neural networks with an arbitrarily large number of hidden units is
\[
\mathcal{N}_n(\sigma) = \bigcup_{m=1}^{\infty} \mathcal{N}_n^{(m)}(\sigma).
\]
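A useful consequence of this definition is that the classes are nested: any $\Phi \in \mathcal{N}_n^{(m)}(\sigma)$ also lies in $\mathcal{N}_n^{(m+1)}(\sigma)$, since appending a hidden unit with output weight $\alpha_{m+1} = 0$ leaves the function unchanged. The sketch below checks this numerically; the helper `phi` and the specific weights are illustrative assumptions.

```python
import numpy as np

def phi(x, alpha, W, b, sigma=np.tanh):
    # Evaluate Phi(x) = sum_j alpha_j * sigma(w_j^T x + b_j).
    return alpha @ sigma(W @ x + b)

# A network in N_1^(2): two hidden units, scalar input (n = 1).
alpha2 = np.array([1.0, -0.5])
W2 = np.array([[1.0], [2.0]])
b2 = np.array([0.0, 1.0])

# The same function viewed in N_1^(3): append a unit with zero output weight.
alpha3 = np.append(alpha2, 0.0)
W3 = np.vstack([W2, [[3.0]]])
b3 = np.append(b2, -1.0)

x = np.array([0.7])
# Both parameterizations define the same function value at x.
assert np.isclose(phi(x, alpha2, W2, b2), phi(x, alpha3, W3, b3))
```

This nesting is what makes the union over $m$ well behaved: $\mathcal{N}_n(\sigma)$ is an increasing union, so membership only requires some finite width $m$.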