Where are we and what should I be doing?
April 1:
Looking at Python tree code.
About to start neural nets.
March 25:
About to do Random Forests.
March 18:
We are about to do the lift curve and ROC/AUC.
March 16:
About to start We8there example of regularized logistic regression.
March 11:
Next time we start at logit.
No homework due at this time.
March 4:
We are about to do the simplest version of the LASSO.
We just did the Lp penalization slide.
Let's make homework 4 due March 11.
February 25:
About to do Section 4, AIC in Linear Models and Regularization.
February 18:
Stopped at proj y on x in mle/opt notes.
Homework 3 extended to Monday, February 22.
February 11:
We stopped at 2. Expectation, Mean, Variance, Covariance.
Homework 3 is on the webpage and is due February 19.
February 9:
About to start the last section (7) of the Knn-bias-variance notes.
Homework 2 due this Friday, February 12.
February 4:
About to start cross-validation in the KNN, Bias-Variance Tradeoff notes.
Homework 2 on webpage, due February 12.
February 2:
About to start section 4, Bias Variance Tradeoff in the KNN notes.
January 28:
Finished NB.
January 26:
Just about to (finally!) do Naive Bayes.
Note that there was a bad typo in Homework 1.
You want to see if year provides additional information given that mileage is in the model.
January 21:
Just started on odds ratio version of Bayes Theorem for binary y.
Homework 1 is on the webpage and it is due February 2.
January 19:
Finished the R Hello world.
Next time we will start the notes on Probability review and Naive Bayes.
January 14:
We finished the Python Hello World (skipping the matrix stuff at the end).
Next we will look at R.
In particular, we will look at the R Hello World on the R webpage which parallels the Python version.
Right now you need to be deciding what software you will use to take
the class. For example, we are looking at R and python.
Notable alternatives are Matlab and Julia, but I don't know if they support
all the tools we need.
If you have to learn R, have a look at the links on the webpage.
I think swirl is the easiest way to go.
I'm not sure what the best way to learn Python is.
Some of the links I have on my Python page look pretty good, for example
A Whirlwind Tour of Python, by Jake VanderPlas
(I really like the book).
Again, the help on the Python/NumPy/pandas/scikit-learn pages is pretty impressive
and the help links in the Jupyter Notebook look great as well.
For my research I use a combination of R and C++.
I have just been picking up Python "randomly".
Overall, I think R is easier, but if you have a programming background you might have a preference for Python.
Clearly, R has more statistics, but Python has scikit-learn, and the neural net stuff seems to be more of a Python thing.