[Image]

Credit: https://www.meme-arsenal.com/en/create/meme/1868835

AUTOENCODER + CLASSIFIER, a brief example for PyTorch.

Author: Wei Ye, Shaowei Wu

This is my first PyTorch coding post. It is based on my group mate Wei Ye's code from CSCI 8581.

In this notebook, I will show, step by step, how to combine an autoencoder with a few classifiers to build a multi-task model in PyTorch. The dataset comes from CSCI 8581.

There are some references for further reading, but I will try to explain the details in 'one' page.

First, define lists that contain the names of the feature and label columns.
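A minimal sketch of this step. The column names below (feat_0, ..., label_0, label_1) are placeholders I made up; the real CSCI 8581 dataset has 69 feature columns and its own label names.

```python
import numpy as np
import pandas as pd

# Hypothetical names -- replace with the actual column names of the dataset.
feature_cols = [f"feat_{i}" for i in range(69)]
label_cols = ["label_0", "label_1"]

# A small dummy frame with the same layout, just to show the slicing.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(5, 69)), columns=feature_cols)
for c in label_cols:
    df[c] = ["a", "b", "a", "c", "b"]

X = df[feature_cols].to_numpy()  # feature matrix, N x 69
y = df[label_cols]               # label columns, still categorical strings
```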

Before we move on, let's scale the feature data. There are a few methods to choose from; I adopt the StandardScaler since it feels the most "natural": in the real world, many physical quantities are normally distributed.

The StandardScaler applies a linear transformation: $$x'=\frac{x-\mu}{\sigma}$$ where $\mu$ and $\sigma$ are the mean and standard deviation of the original data, computed per column.
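For example, on a tiny made-up matrix, each column ends up with mean 0 and standard deviation 1:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # x' = (x - mu) / sigma, per column
```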

For the label data, we use a LabelEncoder to convert each class into an integer code.
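Using the "Crypto" label from the example below as a stand-in, LabelEncoder maps each class name to an integer (the classes are sorted alphabetically before encoding):

```python
from sklearn.preprocessing import LabelEncoder

crypto = ["BTC", "LTC", "DOGE", "ETH", "BTC"]
le = LabelEncoder()
y = le.fit_transform(crypto)  # integer codes: BTC=0, DOGE=1, ETH=2, LTC=3
```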

Now use train_test_split to split X and y into training and test sets. If you would like a validation set as well, just apply train_test_split twice.
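A sketch of the double split on dummy data, giving a 60/20/20 train/validation/test partition (the split ratios are my own choice for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(50, 2)
y = np.arange(50)

# First split off the test set, then carve a validation set out of train.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.25, random_state=42)  # 0.25 * 0.8 = 0.2
```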

We will use CUDA to accelerate the training. For more up-to-date information, visit https://pytorch.org/docs/stable/notes/cuda.html.
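The usual pattern is to pick the device once and move models and tensors onto it, falling back to CPU when no GPU is available:

```python
import torch

# Use the GPU if one is visible to PyTorch, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.randn(4, 69).to(device)  # tensors and models are moved the same way
```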

A command like pip3 install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio===0.8.1 -f https://download.pytorch.org/whl/torch_stable.html, as shown on the official PyTorch website, will install PyTorch along with the CUDA and cuDNN libraries it needs. Note that the command puts everything required into the pip/conda environment: PyTorch will not use system installations of CUDA and cuDNN. However, I do not think the NVIDIA driver will be installed this way; it must already be present on the system. See the link below for an explanation. https://discuss.pytorch.org/t/install-pytorch-gpu-with-pre-installed-cuda-and-cudnn/70808/2

Now let's build a simple FCNN classifier. We have to be careful about the input and output dimensions. The classifier will receive an $N \times 16$ tensor from the autoencoder, which will be defined later. The output of the classifier is an $N \times C$ tensor, where C is the number of classes for a single label and N is the number of records in the minibatch. For example, if the label "Crypto" has 4 classes (BTC, LTC, DOGE, ETH), then C will be 4 and the classifier determines which kind of crypto each record belongs to. One may ask why we do not save the output in a single column, i.e. make the output $N \times 1$. The reason is that we use CrossEntropyLoss as our loss function, and it preprocesses its input with LogSoftmax, so it expects raw per-class scores (logits). Otherwise we would have to process the classifier output with Softmax ourselves.
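A sketch of such a classifier head. The hidden width (32) and layer count are my own illustrative choices, not necessarily what the original notebook uses; the key points are the $16 \to C$ shape and the absence of a final Softmax.

```python
import torch
import torch.nn as nn

class Classifier(nn.Module):
    """A small FCNN head: N x 16 latent codes in, N x C raw logits out.

    No Softmax at the end -- nn.CrossEntropyLoss applies LogSoftmax itself.
    """
    def __init__(self, n_classes: int, latent_dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 32),  # hidden width is an arbitrary choice
            nn.ReLU(),
            nn.Linear(32, n_classes),
        )

    def forward(self, x):
        return self.net(x)

clf = Classifier(n_classes=4)     # e.g. BTC, LTC, DOGE, ETH
logits = clf(torch.randn(8, 16))  # N = 8 records in the minibatch
```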

There are 69 feature columns in our dataset, so the input of our autoencoder should be $N \times 69$.

The autoencoder will output $N \times 16$ tensors for the classifiers and an $N \times 69$ tensor (dec_data) for the reconstruction error. It consists of two parts: the first part (the encoder) compresses the information, in a way quite similar to PCA, and the second part (the decoder) reconstructs it.
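A minimal sketch of the autoencoder, assuming a single shared 16-dim code that all classifier heads consume; the intermediate layer sizes are illustrative and the original notebook's exact architecture may differ.

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Compress 69 features down to a 16-dim code, then reconstruct them."""
    def __init__(self, in_dim: int = 69, latent_dim: int = 16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 32), nn.ReLU(),
            nn.Linear(32, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32), nn.ReLU(),
            nn.Linear(32, in_dim),
        )

    def forward(self, x):
        code = self.encoder(x)         # N x 16, fed to the classifiers
        dec_data = self.decoder(code)  # N x 69, for the reconstruction loss
        return code, dec_data

ae = Autoencoder()
code, dec_data = ae(torch.randn(8, 69))
```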

[Image]

A model_list should be passed to MLP_0.

Remember that PyTorch will not detect the parameters of models stored in a plain Python list inside another model (here we have 23 classifier models inside MLP_0). We have to pass the optimizer a list of all the parameters to be optimized. If we pass list(model.parameters()) only, the optimizer will update the parameters of the autoencoder but not those of the classifiers.
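A sketch of gathering the parameters explicitly. The nn.Linear modules below are stand-ins for the actual autoencoder and the 23 classifier heads:

```python
import torch
import torch.nn as nn

# Illustrative stand-ins for the autoencoder and the per-label classifiers.
autoencoder = nn.Linear(69, 16)
model_list = [nn.Linear(16, 4) for _ in range(23)]

# Collect every parameter tensor explicitly; an optimizer built from
# autoencoder.parameters() alone would never update the classifiers.
params = list(autoencoder.parameters())
for clf in model_list:
    params += list(clf.parameters())

optimizer = torch.optim.Adam(params, lr=1e-3)
```

An alternative is to register the classifiers in an nn.ModuleList inside the parent module, in which case model.parameters() picks them up automatically.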