DEC 18 2025

Coffee time

Yes, I believe it was interesting mathematics, but nothing more than "cute."

The activation function is just a smoothed threshold. Sigmoid, ReLU, the step function: all variations on "below threshold = off, above threshold = on."
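A minimal sketch of that idea, assuming NumPy (the helper names are mine, just for illustration):

```python
import numpy as np

def step(x):
    # Hard threshold: off below zero, on above it.
    return (x > 0).astype(float)

def sigmoid(x):
    # Smoothed threshold: eases from 0 to 1 around zero.
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # One-sided threshold: off below zero, passes the value through above it.
    return np.maximum(0.0, x)

x = np.linspace(-3, 3, 7)
print(step(x))
print(sigmoid(x).round(2))
print(relu(x))
```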

What made neural networks wait 80 years to work?

The math was there in the 1940s. What changed:

  • Scale (more neurons, more layers)
  • Backpropagation (efficient gradient computation; see the sketch after this list)
  • Data (internet gave us training examples)
  • Hardware (GPUs for parallel matrix multiplication)
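
A rough sketch of what "efficient gradient computation" means, assuming NumPy and a toy two-layer network (the weights and numbers here are made up, not from the post): one backward sweep through the chain rule yields every weight's gradient, instead of re-running the network once per weight.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)           # input
y = 1.0                          # target
W1 = rng.normal(size=(4, 3))     # first-layer weights
W2 = rng.normal(size=(1, 4))     # second-layer weights

# Forward pass with a sigmoid "smoothed threshold".
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
h = sigmoid(W1 @ x)              # hidden activations
y_hat = (W2 @ h)[0]              # prediction
loss = 0.5 * (y_hat - y) ** 2

# Backward pass: reuse the forward quantities, push the error back.
d_yhat = y_hat - y               # dL/dy_hat
dW2 = d_yhat * h[None, :]        # dL/dW2
d_h = d_yhat * W2[0]             # dL/dh
d_z1 = d_h * h * (1 - h)         # through the sigmoid's derivative
dW1 = np.outer(d_z1, x)          # dL/dW1

print(loss, dW2.shape, dW1.shape)
```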