Kaggle competition
 

Starting on Galaxy Zoo

As a first stab at working on the Galaxy Zoo problem, I ran least squares on a variety of compressed image dimensions.
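A minimal sketch of this kind of experiment, assuming square grayscale images and block-average downsampling. The helper names (`downsample`, `fit_least_squares`, `predict`, `rmse`), the random stand-in data, and the list of sizes are illustrative assumptions, not the code or data used for the post.

```python
import numpy as np

def downsample(images, size):
    """Shrink square grayscale images to size x size by averaging pixel blocks."""
    n, h, w = images.shape
    assert h == w and h % size == 0, "side length must be a multiple of size"
    f = h // size  # block width in pixels
    return images.reshape(n, size, f, size, f).mean(axis=(2, 4))

def fit_least_squares(X, Y):
    """Ordinary least squares with an appended bias column."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    W, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return W

def predict(W, X):
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    return A @ W

def rmse(Y_true, Y_pred):
    return float(np.sqrt(np.mean((Y_true - Y_pred) ** 2)))

# Random stand-in data (assumption): 200 images of 96x96 pixels, 5 targets each.
rng = np.random.default_rng(0)
images = rng.random((200, 96, 96))
targets = rng.random((200, 5))

# Fit on the first 150 images and score on the remaining 50, per image size.
scores = {}
for size in (8, 16, 24, 32):
    X = downsample(images, size).reshape(len(images), -1)
    W = fit_least_squares(X[:150], targets[:150])
    scores[size] = rmse(targets[150:], predict(W, X[150:]))
```

Scoring on a held-out split rather than on the training rows matters here: once the pixel count exceeds the number of training images, least squares can fit the training set exactly, so training error alone would always favor the largest images.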

The results show that the best score came from 32×32 images, but that 24×24 performed about the same. This is good news, since the number of pixel features grows quadratically with image side length: a 32×32 image has 1024 pixels per channel, against 576 for 24×24. My next move is to try matrix factorization/PCA to get a better least squares result, and then to try regularization.
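One way those next steps could look, as a rough sketch: PCA computed from an SVD of the centered pixel matrix, followed by ridge regression (L2 regularization, one common choice; the post does not say which regularizer is planned). The function names, the number of components `k`, the penalty `lam`, and the random data are all assumptions for illustration.

```python
import numpy as np

def pca_fit(X, k):
    """Return the feature mean and the top-k principal directions of X."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def pca_transform(X, mean, components):
    """Project centered data onto the principal directions."""
    return (X - mean) @ components.T

def ridge_fit(Z, Y, lam):
    """Ridge regression; the intercept is left unpenalized by centering first."""
    z_mean, y_mean = Z.mean(axis=0), Y.mean(axis=0)
    Zc = Z - z_mean
    W = np.linalg.solve(Zc.T @ Zc + lam * np.eye(Z.shape[1]), Zc.T @ (Y - y_mean))
    b = y_mean - z_mean @ W
    return W, b

# Random stand-in data (assumption): 200 flattened 24x24 images, 5 targets each.
rng = np.random.default_rng(0)
X = rng.random((200, 24 * 24))
Y = rng.random((200, 5))

mean, components = pca_fit(X, k=20)      # k is a hypothetical choice
Z = pca_transform(X, mean, components)   # 200 x 20 reduced features
W, b = ridge_fit(Z, Y, lam=1.0)          # lam is a hypothetical choice
preds = Z @ W + b
```

Both `k` and `lam` would need tuning on held-out data; setting `lam=0` recovers plain least squares on the reduced features.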

I’ve also noted that some form of data augmentation will be important in my final submissions, but I’m working with the hypothesis that what works well without data augmentation will also work well with it. This allows me to avoid its computational costs during this more experimental period.
