Shayan Srinivasa Garani
[Tuesday, Thursday 11:30 am – 1:00 pm, DESE classroom]
Office Hours: Wednesday, 4 – 5 PM
Familiarity with digital signal processing and engineering mathematics at the undergraduate level.
- Introduction, models of a neuron, neural networks as directed graphs, network architectures (feed-forward, feedback, etc.).
- Learning processes, learning tasks.
- Perceptron, perceptron convergence theorem, relationship between perceptron and Bayes classifiers, batch perceptron algorithm.
- Modeling through regression: linear, logistic for multiple classes.
- Multilayer perceptron (MLP), batch and online learning, derivation of the back propagation algorithm, XOR problem, role of Hessian in online learning, annealing and optimal control of learning rate.
- Approximations of functions, universal approximation theorem, cross-validation, network pruning and complexity regularization, convolution networks, nonlinear filtering.
- Cover’s theorem and pattern separability, the interpolation problem, RBF networks, hybrid learning procedure for RBF networks, kernel regression and relationship to RBFs.
- Support vector machines, optimal hyperplane for linear separability, optimal hyperplane for non-separable patterns, SVM as a kernel machine, design of SVMs, XOR problem revisited, robustness considerations for regression, representer theorem.
- Introduction to regularization theory, Hadamard’s condition for well-posedness, Tikhonov regularization, regularization networks, generalized RBF networks, estimation of regularization parameter, etc.
- L1 regularization basics, algorithms and extensions.
- Principal component analysis: Hebbian based PCA, kernel based PCA, kernel Hebbian algorithm.
- Deep MLPs, deep auto-encoders, stacked denoising auto-encoders.
- S. Haykin, Neural Networks and Learning Machines, Pearson Press.
- K. Murphy, Machine Learning: A Probabilistic Perspective, MIT Press.
Final exam: 20%
- Homework #1 – Solutions
- Homework #2 – Solutions
- Homework #3 – Solutions
- Homework #4 – Solutions
- Homework #5 – Solutions
Papers to read:
- Moshe Leshno, Vladimir Ya. Lin, Allan Pinkus, and Shimon Schocken. Multilayer feedforward networks with a nonpolynomial activation function can approximate any function. Neural Networks, 6:861–867, 1993.
- G. Cybenko. Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals, and Systems (MCSS), 2(4):303–314, December 1989.
Project Presentation: 3rd May, 08:30 to 11:00, DESE Classroom
Final exam: 30th April, 09:00 to 12:00, DESE Classroom
- Midterm 1: Thursday, 28th February, 11:30 – 13:00.
- Homework 2:
  - Problem 1 – To be submitted in class on 26th February.
  - Problems 2 and 3 – To be submitted in class on 5th March.
| Week | Topics Covered | Lecture Notes |
|---|---|---|
| Week 1 | The human brain; Introduction to Neural Networks; Models of a neuron; Feedback and network architectures; Knowledge representation; Prior information and invariances | NNLS_Week_01 |
| Week 2 | Learning processes; Perceptron; Batch perceptron algorithm; Perceptron and Bayes classifier | NNLS_Week_02 |
| Week 3 | Linear regression; Logistic regression; Multi-layer perceptron | NNLS_Week_03 |
| Week 4 | Multi-layer perceptron; Back propagation; XOR problem | NNLS_Week_04 |
| Week 5 | Universal approximation theorem; Complexity regularization and cross-validation; Convolutional Neural Networks (CNN); Cover’s Theorem | NNLS_Week_05 |
| Week 6 | Multivariate interpolation problem; Radial basis functions (RBF); Recursive least squares algorithm; Comparison of RBF with MLP; Kernel regression using RBFs; Kernel Functions; Basics of constrained optimization | NNLS_Week_06 |
| Week 7 | Optimization with equality constraint; Optimization with inequality constraint; Support Vector Machines (SVM); Optimal hyperplane for linearly separable patterns; Quadratic optimization for finding optimal hyperplane | NNLS_Week_07 |
| Week 8 | Optimal hyperplane for non-linearly separable patterns; Inner product kernel and Mercer’s theorem; Optimal design of an SVM; ε-insensitive loss function; XOR problem revisited using SVMs; Hilbert Space | NNLS_Week_08 |
| Week 9 | Reproducing Kernel Hilbert Space; Representer Theorem; Generalized applicability of the representer theorem; Regularization Theory; Euler-Lagrange Equation; Regularization Networks | NNLS_Week_09 |
| Week 10 | Generalized RBF networks; XOR problem revisited using RBF; Structural Risk Minimization; Bias-Variance Dilemma; Estimation of regularization parameters | NNLS_Week_10 |
| Week 11 | Basics of L1 regularization; Grafting; VC dimension; Autoencoders; Denoising Autoencoders | NNLS_Week_11 |
| Week 12 | Kernel PCA; Hebbian-based maximum eigenfilter | NNLS_Week_12 |