Problem 1. Basic Optimization, MLE for IID Poisson Data

Suppose \(y_i\) is a count. Then a very common model is to assume the Poisson distribution: \[ P(Y=y \;|\; \lambda) = \frac{e^{-\lambda} \, \lambda^y}{y!}, \; y = 0,1,2,\ldots \]

Given \(Y_i \sim Poisson(\lambda)\) iid, with observed values \(Y_i = y_i\) for \(i = 1, \ldots, n\), what is the MLE of \(\lambda\)?
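Once you have a candidate answer, a quick way to check it is to simulate data and maximize the log-likelihood numerically in R. A minimal sketch (the simulation settings below are illustrative, not part of the problem):

set.seed(99)
y = rpois(200, lambda = 3)                    # fake iid Poisson data
loglik = function(lam) sum(dpois(y, lam, log = TRUE))   # log-likelihood in lambda
opt = optimize(loglik, interval = c(0.01, 20), maximum = TRUE)
cat("numerical MLE:", opt$maximum, "\n")      # compare with your formula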

Problem 2. Constrained Optimization

Let \[ f(x) = (x_1 - a_1)^2 + (x_2 - a_2)^2, \;\; g(x) = x_1^2 + x_2^2 - 1, \] where \(x = (x_1, x_2)\).

Minimize \(f(x)\) subject to the constraint that \(g(x) \leq 0\).

First draw simple pictures to make the solution obvious.
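A minimal R sketch of the picture: the unit circle (the boundary of the constraint set \(g \le 0\)) together with the contours of \(f\), which are circles centered at \((a_1, a_2)\). The value of a below is illustrative; try points both inside and outside the disk.

a = c(1.5, 1.0)                  # illustrative (a1, a2)
th = seq(0, 2*pi, length.out = 200)
plot(cos(th), sin(th), type = "l", asp = 1, xlim = c(-2, 3), ylim = c(-2, 3),
     xlab = "x1", ylab = "x2")   # boundary of the constraint set
points(a[1], a[2], pch = 19)     # unconstrained minimizer of f
for (r in c(0.5, 1, 1.5))        # contours of f: circles centered at a
  lines(a[1] + r*cos(th), a[2] + r*sin(th), lty = 2)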

Then check that the Lagrange multiplier first-order condition conforms with your intuition.
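For reference, the standard Lagrangian setup for minimizing \(f\) subject to \(g \le 0\) (this is the general textbook form, not specific to this problem) is \[ \mathcal{L}(x,\mu) = f(x) + \mu\, g(x), \qquad \nabla f(x) + \mu \nabla g(x) = 0, \quad \mu \geq 0, \quad \mu\, g(x) = 0. \]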

How does the norm of \((a_1,a_2)\) affect the solution?

Problem 3. Polynomial Regression

A basic idea in nonlinear regression is to use polynomial terms.

With one \(x\) variable, this means we consider the models: \[ Y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \ldots + \beta_p x_i^p + \epsilon_i \]

Using the simple used cars data (with \(n = 1{,}000\)), with \(Y\) = price and \(x\) = mileage, find the best choice of \(p\).
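One simple way to choose \(p\) is an out-of-sample comparison across degrees. A minimal sketch (the file name and the price/mileage column names below are assumptions; point read.csv at your copy of the simple cars data):

scd = read.csv("susedcars.csv")   # placeholder file name
set.seed(14)
ii = sample(1:nrow(scd), floor(0.75 * nrow(scd)))   # 75/25 train/test split
rmse = rep(NA, 10)
for (p in 1:10) {
  fit = lm(price ~ poly(mileage, p), data = scd[ii, ])
  pred = predict(fit, newdata = scd[-ii, ])
  rmse[p] = sqrt(mean((scd$price[-ii] - pred)^2))
}
plot(1:10, rmse, type = "b", xlab = "p", ylab = "out-of-sample RMSE")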

Fit your chosen polynomial model using all the data and plot the fit on top of the data. Do you like it? Also plot the fit for a \(p\) that is “way too big”. What is wrong with it?
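Continuing the sketch above, plotting the chosen fit might look like this:

pbest = which.min(rmse)           # or whatever p you settled on
fit = lm(price ~ poly(mileage, pbest), data = scd)
oo = order(scd$mileage)           # sort by x so the fitted line draws cleanly
plot(scd$mileage, scd$price, pch = 16, cex = 0.5, col = "grey",
     xlab = "mileage", ylab = "price")
lines(scd$mileage[oo], fitted(fit)[oo], col = "blue", lwd = 2)
# rerun with a deliberately large p (say 20) to see the overfit wiggles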

Problem 4. Regularized Regression

Let’s try ridge and LASSO on the car price data.

cd = read.csv("http://www.rob-mcculloch.org/data/usedcars.csv")
print(dim(cd))
## [1] 20063    11

Note that this version of the cars data has 20 thousand observations and 11 variables.

In addition, many of the x variables are categorical, so you will have to dummy them up (see the model.matrix sketch after the displacement table below).

sapply(cd,is.numeric)
##        price         trim   isOneOwner      mileage         year        color 
##         TRUE        FALSE        FALSE         TRUE         TRUE        FALSE 
## displacement         fuel       region  soundSystem    wheelType 
##         TRUE        FALSE        FALSE        FALSE        FALSE

displacement is stored as numeric but is actually categorical:

table(cd$displacement)
## 
##    3  3.2  3.5  3.7  4.2  4.3  4.6    5  5.4  5.5  5.8    6  6.3 
##  204  274  227  141  239 2787 2794 2661  356 9561  112  213  494
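One way to dummy up the categorical features is with model.matrix, since glmnet takes a numeric matrix rather than a formula. A minimal sketch (the use of log(price) anticipates parts (a) and (b) below):

cd$displacement = as.factor(cd$displacement)  # treat displacement as categorical
x = model.matrix(price ~ ., data = cd)[, -1]  # dummies for all factors; drop the intercept column
y = log(cd$price)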

(a)

Use the LASSO to relate log of price to the features.

(b)

Use ridge regression to relate log of price to the features.

Note that in R, you can use the glmnet package for both the LASSO and ridge.

Here is the glmnet help on the parameter alpha.

alpha : The elasticnet mixing parameter, with 0 <= alpha <= 1. The penalty is defined as
(1-alpha)/2 ||beta||_2^2 + alpha ||beta||_1.
alpha = 1 is the lasso penalty, and alpha = 0 the ridge penalty.
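A minimal glmnet sketch for (a) and (b), assuming x and y were built as in the model.matrix snippet above (cv.glmnet chooses lambda by cross-validation):

library(glmnet)
set.seed(99)
cvL = cv.glmnet(x, y, alpha = 1)   # (a) LASSO
cvR = cv.glmnet(x, y, alpha = 0)   # (b) ridge
plot(cvL); plot(cvR)               # CV error curves vs log(lambda)
coef(cvL, s = "lambda.min")        # coefficients at the CV-chosen lambda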