Where are we and what should I be doing?


April 1:

Looking at Python tree code.
About to start neural nets.


March 25:

About to do Random Forests.


March 18:

We are about to do the lift curve and ROC/AUC.


March 16:

About to start the We8there example of regularized logistic regression.


March 11:

Next time we start at logit.

No homework due at this time.


March 4:

We are about to do the simplest version of the LASSO.
We just did the Lp penalization slide.

Let's make homework 4 due March 11.


February 25:

About to do Section 4, AIC in Linear Models and Regularization.




February 18:

Stopped at proj y on x in mle/opt notes.

Homework 3 extended to Monday, February 22.


February 11:

We stopped at 2. Expectation, Mean, Variance, Covariance.

Homework 3 is on the webpage and is due February 19.


February 9:

About to start the last section (7) of the Knn-bias-variance notes.

Homework 2 due this Friday, February 12.


February 4:

About to start cross-validation in the KNN, Bias-Variance Tradeoff notes.
Homework 2 on webpage, due February 12.


February 2:

About to start section 4, Bias Variance Tradeoff in the KNN notes.


January 28:

Finished NB.


January 26:

Just about to (finally!) do Naive Bayes.

Note that there was a bad typo in homework 1.
You want to see if year provides additional information given that mileage is in the model.


January 21:

Just started on the odds ratio version of Bayes Theorem for binary y.

Homework 1 is on the webpage and it is due February 2.


January 19:

Finished the R Hello world.

Next time we will start the notes on Probability review and Naive Bayes.


January 14:

We finished the Python Hello World (skipping the matrix stuff at the end).
Next we will look at R.
In particular, we will look at the R Hello World on the R webpage which parallels the Python version.

Right now you need to be deciding what software you will use to take
the class. For example, we are looking at R and Python.
Notable alternatives are Matlab and Julia, but I don't know if they support
all the tools we need.

If you have to learn R, have a look at the links on the webpage.
I think swirl is the easiest way to go.

I'm not sure what the best way to learn Python is.
Some of the links I have on my Python page look pretty good, for example
A Whirlwind Tour of Python, by Jake VanderPlas (I really like the book).
Again, the help on the Python/NumPy/pandas/scikit-learn pages is pretty impressive
and the help links in the Jupyter Notebook look great as well.

For my research I use a combination of R and C++.
I have just been picking up Python "randomly".
Overall, I think R is easier, but if you have a programming background you might have a preference for Python.
Clearly, R has more statistics, but Python has scikit-learn and the neural net stuff seems to be more of a Python thing.