Programming Assignment 2: Neural Network Classifier Solution




In this homework assignment, you need to implement a simple 2-hidden-layer Multi-Layer Neural Network using Python and Numpy. You are given data generated from three blackboxes: blackbox21, blackbox22, and blackbox23. The description and tasks for each blackbox are the same. The following instruction is for blackbox21 as an example, and you can apply the same to the blackbox22 and blackbox23 as well.

1. Homework Description:

Please use Python 3 to implement your homework assignment.

1.1 Data Description

In this assignment, you are given a set of training data and a set of testing data generated from a specific blackbox, say blackbox21, which you may not know the secret function inside. You can use the training set to develop your network and use the testing set to measure the accuracy of your network during the development. For grading, we will generate our hidden testing data from the same blackbox to test the performance of your submitted NN.

For blackbox21, you are given:

  • blackbox21_train.csv: labeled training data generated from a blackbox21

  • blackbox21_test.csv: unlabeled testing data

  • blackbox21_example_predictions.csv: example output, which is also the true class labels for blackbox21_test.csv so you can get to know the format of output and measure your model’s performance while you are developing your program.

This data format is similar to the last homework assignment. However, The label (target value) of data in this homework assignment is​​not binary. So this means in this homework assignment, you need to implement a Multi-Class Classifier. Meanwhile, there are three featuresin the blackboxes, not two.

1.2 Task Description

Your task is to implement a multi-hidden-layer neural network learner (see 1.3 model description part for details of neural network you need to implement), named as, that will


INF552 Machine Learning for Data Science

  1. Construct a neural network classifier from the given training data,

  1. Use the learned classifier to classify the unlabeled test data, and

  1. Output the predictions of your classifier on the test data into a file named blackbox2*_predictions.csv in the same directory as the .py,

  1. Finish in 5 minutes (to train one model for one blackbox).

Your program will take two input files and produce one output file as follows:

python3 training_data_path testing_data_path

  • prediction_file

For example,

python3 blackbox21_train.csv blackbox21_test.csv

  • blackbox21_predictions.csv

Note: input files may not be in the same directory as your python script1.

In other words, your algorithm file NeuralNetwork.pywill take labeled training data, unlabeled testing dataas input, and output your classification predictions on testing data as output. In your implementation, please do not use any existing machine learning library call. You must implement the algorithm yourself. Please develop your code yourself and do not copy from other students or from the Internet.

The format of blackbox2*_train.csvlooks like:

x1, x2, x3, y

Where x1, x2, andx3are the attribute values and yis the label, and blackbox2*_test.csvare unlabeled. Notice that the data here has 3 (not 2) attributes.

Your output blackbox2*_predictions.csv​​will look like





  • (A single column indicates the predicted class labels for each unlabeled sample in the input test file)

The format of your blackbox2*_ predictions.csvfile is crucial. It has to be in theexact same name and formatso that it can be parsed correctly to compare with true labels by grading scripts.

When we grade your algorithm, we will use the same training data but some unlabeled hidden testing data (generated from the same blackbox21) instead of the testing data that was given to you. Your code will be autograded for technical correctness. Please name your file correctly, or

  • os.path.basename(path)​​can be used to extract the base name of pathnamepath.


INF552 Machine Learning for Data Science

you will wreak havoc on the autograder. The maximum running time to train a model is 5 minutes (for a single blackbox), so please make sure your program finishes in 5 minutes.

1.3 Model Description

The basic structure model of neural network in this homework assignment is as below.

The above figure shows a 2-hidden-layer neural network. At each hidden layer, you need to use an activation function (e.g. relu function, see references below). Since it is a multi-class classification problem, you need to usesoftmax function(see references below) as activation at the final output layer to generate probability distribution of each class. There is no specific requirement on the number of nodes in each layer, you need to choose them to make your neural network reach best performance. Also, the number of nodes in the input layer should be number of features, and the number of nodes in output layer should be number of classes.

You are encouraged to implement more than 2-layer neural networks if you want, but in this homework assignment 2-layer neural network is enough to get good performance.

There are some hyper parameters you need to tune to get better performance. You need to find best hyper-parameters so that your neural network can get good performance on test and hidden test data.

  • Learning rate: step size for update weights (e.g. weights = weights – learning * grads), different optimizer have different way to use learning rate. (see reference in 2.1)

  • Batch size: number of samples processed each time before the model is updated. The size of a batch must be more than or equal to one, and less than or equal to the number


INF552 Machine Learning for Data Science

of samples in the training dataset. (e.g suppose your dataset is of 1000, and your batch size is 100, then you have 10 batches, each time you train one batch (100 samples) and after 10 batches, it trains all samples in your dataset)

  • Number of epoch: the number of complete passes through the training dataset (e.g. you have 1000 samples, 20 epochs means you loop this 1000 samples 20 times, suppose your batch size is 100, so in each epoch you train 1000/100 = 10 batches to loop the entire dataset and then you repeat this process 20 times)

  • Number of units in each hidden layer

Remember that the program has to finish in 5 minutes, so choose your hyper-parameters wisely.


[1] Relu function –

[2] Softmax function –

2. Implementation Guidance

2.1 Suggested steps

  1. Split dataset to batches (see 2.3 – there is a helper function to help you generate batches)

  2. Initialize weights and bias(see reference and 2.3) – suggest to use Xavier initialization

  3. Select one batch of data​​and calculate forward pass– follow the basic structure of neural network to compute output for each layer, you might need to cache output of each layer for convenient of backward propagation.

  1. Compute loss function– you need to use cross-entropy(logistic loss – see references) as loss function(see 2.3 – there is a helper function called log_loss to compute this)

  1. Backward propagation – use backward propagation (your implementation) to update hidden weights (see 2.2)

  1. Updates weights using optimization algorithms– there are many ways to update weighs see reference to find different optimizers(see 2.3 for SGD and Adam helper functions)

  1. Repeat 2,3,4,5,6 for all batches– after finishing this process for all batches (it just iterates all data points of dataset), it is called ‘one epoch’.

  1. Repeat 2,3,4,5,6,7 number of epochs times– You might need to train many epochs to get a good result.


INF552 Machine Learning for Data Science


  1. cross-entropy loss(logistic loss):

  1. Optimizer for neural network

  1. Weights initialization

    • Example of Xavier initializations


INF552 Machine Learning for Data Science

References for Backward propagation







[7] gation-e3c1a5a1e536

Feel free to find more, there are a lot of tutorials on the web.

2.2 Helper functions

We will provide you with a bunch of helper functions. You might need to understand them to use since it needs to be compatible to your program design.

1. relu()– it is used to compute output relu function.

  1. inplace_relu_derivative()– it is used to compute derivative of relu function

  1. softmax()– it is used to compute softmax function.

  1. log_loss()​​– it is used to compute logistic loss (cross entropy)

  1. SGDOptimizer Class implements stochastic gradient descent optimizer algorithm with momentum.

  1. AdamOptimizer Class– implements adam optimizer algorithm.

  2. gen_batches()– split dataset to generate small batches.

  1. label_binarize()– binarize label for multiclass

There are some more functions that you may use. See utils.pyfor details.

2.3 Learning Curve Graph

In order to make sure your neural network actually learns something, You need to make a plot as below to show the learning process of your neural networks. After every epoch (one epoch means going through all samples of data once), you need to record your accuracy of training set and validation set (it is just the test set we give you) and make a plot of those accuracy.


INF552 Machine Learning for Data Science

3. Submission:

Your submission contains 2 parts.

3.1 Submit your code to Vocareum

  • Submit your to​ ​Vocareum, with other related scripts that you may need (e.g.

  • After your submission, Vocareum would run two scripts to test your code, a submission script and a grading script. The submission script will test your code with only blackbox21through the training and test data we give you, while the grading script will test your code with all the 3 blackboxesthrough our hidden test data.

  • The submission script has a time limit of 5 minutes(for 1 blackbox). The grading script has a time limit of 15 minutes(for 3 blackboxes). The program would terminate and fail if it exceeds the time limit.

  • After the submission script/grading script finishes, you can view your submission report immediately after the program finishes to see if your code works. The grading report will be released after the deadline of the homework. Only test accuracy for blackbox21 will be displayed in the submission report.

  • You don’t have to keep the page open while the scripts are running.

3.2 Submit your report to Blackboard

    • Create a single .zip( which contains:

      • You may also submit other related scripts.

      • Report.pdf, a briefreport indicates yourtraining accuracy, testing accuracy andlearning curve graph for blackbox21, blackbox22, and blackbox23.

    • Submit your zipped file to the blackboard.

  1. Rubric:

100 points in total

  • Program correctness(60 points): your program always works correctly and meets the specifications within the time limit.

  • Documentation(20 points): your code is well commented and the submitted learning curve graph looks reasonable.

  • Performance (20 points):

    • overall good performance on reported train and test accuracy. (10 points)

    • accuracy on hidden testing data. (10 points)


INF552 Machine Learning for Data Science


INF552 Machine Learning for Data Science


Homework must be submitted through Vocareum. Please only upload your code to the /work directory. Don’t create any subfolder or upload any other files. Please refer to get started with Vocareum.

How to submit

  1. Log in to Vocareum website and you should see Assignment 1under INF 552 Course

  1. Click My Work

  1. Select workfolder on the left

  1. You can either create new files or upload files from your local machine. Make sure to name your python file correctly based on homework description

5. Click on Submitand press Yes. You are able to submit as many times as you want

You can test your code on your local machine in the way described in the homework handout. On vocareum, you only need to Submit, then thePassed/Failed” message will appear automatically. You can check your submission results in the Terminal or under Details tab. If your code works correctly, you should see the results as below:

Submission Report

[Executed at: Sun Sep 8 18:32:08 PDT 2019] ================================================== blackbox21 passed, test accuracy: 90.0 ==================================================

Here, test accuracy is based on the test data given to you, so you can compare this with the results on your local machine to make sure everything works properly on Vocareum.

After the deadline, your finalsubmissionwill be graded using hidden test data, and you will be able to see your grades on Vocareum.


error: Content is protected !!