Instructor:
Shayan Srinivasa Garani,
Class timings: Tuesday, Thursday 11:30 am – 1:00 pm, DESE classroom
Office Hours: Wednesday 4 – 5PM
Prerequisites:
Familiarity with digital signal processing and engineering mathematics at the undergraduate level.
Course Syllabus:
Introduction; models of a neuron; neural networks as directed graphs; network architectures (feedforward, feedback, etc.); learning processes and learning tasks; the perceptron and the perceptron convergence theorem; relationship between the perceptron and Bayes classifiers; batch perceptron algorithm; modeling through regression: linear and logistic regression for multiple classes; multilayer perceptron (MLP); batch and online learning; derivation of the back-propagation algorithm; the XOR problem; role of the Hessian in online learning; annealing and optimal control of the learning rate; approximation of functions and the universal approximation theorem; cross-validation; network pruning and complexity regularization; convolutional networks; nonlinear filtering; Cover's theorem and pattern separability; the interpolation problem; RBF networks; hybrid learning procedure for RBF networks; kernel regression and its relationship to RBFs; support vector machines; optimal hyperplane for linearly separable patterns; optimal hyperplane for non-separable patterns; the SVM as a kernel machine; design of SVMs; the XOR problem revisited; robustness considerations for regression; the representer theorem; introduction to regularization theory; Hadamard's conditions for well-posedness; Tikhonov regularization; regularization networks; generalized RBF networks; estimation of the regularization parameter; L1 regularization: basics, algorithms, and extensions; principal component analysis: Hebbian-based PCA, kernel-based PCA, and the kernel Hebbian algorithm; deep MLPs; deep autoencoders; stacked denoising autoencoders.
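As a flavor of the first algorithmic topic in the syllabus, the Rosenblatt perceptron learning rule can be sketched in a few lines. This is a minimal illustrative sketch, not course material; the function name and the toy data are made up for this example:

```python
import numpy as np

def train_perceptron(X, y, epochs=100, lr=1.0):
    """Perceptron learning rule for labels y in {-1, +1}.

    Appends a bias input and updates the weight vector only on
    misclassified samples; by the perceptron convergence theorem,
    the loop terminates for linearly separable data.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # augment with bias input
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:      # misclassified (or on the boundary)
                w += lr * yi * xi       # rotate hyperplane toward xi
                errors += 1
        if errors == 0:                 # converged: every sample correct
            break
    return w

# Linearly separable toy data: the class is the sign of the first coordinate.
X = np.array([[2.0, 1.0], [1.5, -0.5], [-1.0, 0.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w = train_perceptron(X, y)
preds = np.sign(np.hstack([X, np.ones((4, 1))]) @ w)
```

On separable data like this, the learned weight vector classifies every training point correctly; on non-separable data the loop simply stops after `epochs` passes, which is the limitation that motivates the Bayes-classifier comparison covered later.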
Reference Books:
 S. Haykin, Neural Networks and Learning Machines, Pearson Press.
 K. Murphy, Machine Learning: A Probabilistic Perspective, MIT Press.
Grading Policy:
Homeworks: 30%
Midterm: 20%
Project: 30%
Final exam: 20%
Homeworks:
 Homework #1 – Solutions
 Homework #2 – Solutions
 Homework #3 – Solutions
 Homework #4 – Solutions
 Homework #5 – Solutions
Exams:
Papers to read:
 Moshe Leshno, Vladimir Ya. Lin, Allan Pinkus, and Shimon Schocken. Multilayer feedforward networks with a nonpolynomial activation function can approximate any function. Neural Networks, 6:861–867, 1993.
 G. Cybenko. Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals, and Systems (MCSS), 2(4):303–314, December 1989.
Announcements:
 Project presentation: 3rd May, 08:30 – 11:00, DESE Classroom
 Final exam: 30th April, 09:00 – 12:00, DESE Classroom
 Midterm 1: Thursday, 28th February, 11:30 – 13:00
 Homework 2:
 Problem 1 – to be submitted in class on 26th February.
 Problems 2 and 3 – to be submitted in class on 5th March.
Topics Covered and Lecture Notes:
 Week 1: The human brain; Introduction to neural networks; Models of a neuron; Feedback and network architectures; Knowledge representation; Prior information and invariances [NNLS_Week_01]
 Week 2: Learning processes; Perceptron; Batch perceptron algorithm; Perceptron and Bayes classifier [NNLS_Week_02]
 Week 3: Linear regression; Logistic regression; Multilayer perceptron [NNLS_Week_03]
 Week 4: Multilayer perceptron; Back propagation; XOR problem [NNLS_Week_04]
 Week 5: Universal approximation theorem; Complexity regularization and cross-validation; Convolutional Neural Networks (CNN); Cover's theorem [NNLS_Week_05]
 Week 6: Multivariate interpolation problem; Radial basis functions (RBF); Recursive least squares algorithm; Comparison of RBF with MLP; Kernel regression using RBFs; Kernel functions; Basics of constrained optimization [NNLS_Week_06]
 Week 7: Optimization with equality constraints; Optimization with inequality constraints; Support Vector Machines (SVM); Optimal hyperplane for linearly separable patterns; Quadratic optimization for finding the optimal hyperplane [NNLS_Week_07]
 Week 8: Optimal hyperplane for nonlinearly separable patterns; Inner product kernel and Mercer's theorem; Optimal design of an SVM; ε-insensitive loss function; XOR problem revisited using SVMs; Hilbert space [NNLS_Week_08]
 Week 9: Reproducing kernel Hilbert space; Representer theorem; Generalized applicability of the representer theorem; Regularization theory; Euler-Lagrange equation; Regularization networks [NNLS_Week_09]
 Week 10: Generalized RBF networks; XOR problem revisited using RBF; Structural risk minimization; Bias-variance dilemma; Estimation of regularization parameters [NNLS_Week_10]
 Week 11: Basics of L1 regularization; Grafting; VC dimension; Autoencoders; Denoising autoencoders [NNLS_Week_11]
 Week 12: Kernel PCA; Hebbian-based maximum eigenfilter [NNLS_Week_12]