Description
Misunderstanding of probability may be the greatest of all impediments to scienti c literacy.
Gould, Stephen Jay

Student Name:
Student Number:
Instructor George Tzanetakis
Question
Value
Mark
1
4
2
3
3
3
Total
10

Overview
So far in the assignments we have been mostly building things from \scratch” and not using existing libraries and frameworks as much. In this assignment you are asked to use existing tools to explore some of the ideas we have learned in class. I suggest software in Python for this purpose but if you can nd similar software in other languages that is ok. The topics covered are CSP, machine learning with Naive Bayes and Discrete Bayesian Networks.
Don’t hesitate to contact the instructor via email or utilize the chat room of the ConneX course website for any questions/clari cations you might need.
The questions are marked as (*) necessary to pass, (**) expected and (***) exceptional.
IMPORTANT: YOU SHOULD ONLY SUBMIT A SINGLE PDF FILE WITH YOUR REPORT THROUGH CONNEX. ANY OTHER FORMAT WILL NOT BE ACCEPTED. THE ANSWERS SHOULD BE LABELED BY THE CORRESPONDING QUESTIONS NUMBERS

Constraintsatisfaction problems (4pts)


Using the conventions of the CSP library you are using solve the Cryptarithmetic puzzle provided in Figure 5.2 of the AIMA textbook.(3pts (ONLY UGRADS) – **)



Similarly solve the problem of Waltz Filtering described in the following document http://www.cs.cityu.edu.hk/_{~}hwchun/research/ PDF/PAJava99Waltz.pdf using the library. There is a lot of information in the paper and it uses JSolver a java library for constraint programming. Your task is to transfer the constraints and problem described in the paper to pgmpy. This is easier than it might seem at rst read but it does require some more e ort than the simple example of the rst question which is why this question if for graduate students or undergraduates who want an A+. 4pts (GRADS – NO NEED TO IMPLEMENT QUESTION 1), 1pt (UGRAD).


Naive Bayes Text Classi cation using scikitlearn (3pts)
Using the movie review data from assignment 2 use the scikitlearn library to compute the classi cation accuracy using 10fold crossvalidation of the Naive Bayes Bernoulli and Naive Bayes Multinomial classi ers that are part of scikitlearn. Show your code and results. You will need to read the excellent documentation of scikitlearn for the details. (3pts – ***)

Discrete Bayesian Networks (3pts)
Consider how to compute the probability of a particular state for example the probability that the candidate has strong musicianship, the pieces played are not that di cult, the rating is 2 stars, the exam score is high, and the recommendation letter is weak. We can use the semantics of Bayesian networks

Di culty
Musicianship
Low
High
Weak
Strong
0:6
0:4
0:7
0:3
Di culty
Musicianship

Rating
Rating
Exam
Di culty Musicianship
*
**
***
Exam
Low
Weak
0:3
0:4
0:3
Musicianship
Low
High
Low
Strong
0:05
0:25
0:7
Letter
High
Weak
0:9
0:08
0:02
Weak
0:95
0:05
High
Strong
0:5
0:3
0:2
High
0:2
0:8

Letter
Rating
Weak
Strong
*
0.1
0:9
**
0.4
0:6
***
0.99
0:01
to compute this joint probability as follows:
P (m = strong)P (d = low)P (r = jm = strong; d = low)
P (e = highjm = strong)P (letter = weakj )

0:3 0:6 0:08 0:8 0:4


0:004608

For this question my suggestion is to use https://github.com/pgmpy/ pgmpy which is a Python Library for Probabilistic Graphical Models for this assignment. Your deliverable should show all the code you used to solve the problem. Your task is to explore this particular discrete Bayesian Network.

Using pgmpy (or some other simialr library) show how the query provided as an example can be computed using exact inference (1pt – **)

Consider a particular musician, George and we want to know how likely he will get a strong recommendation letter. Knowing nothing else this probability is about 50:2%. If we nd out that George is not a strong musician then that probability goes down to around 38:9%. Show how these numbers are derived using the software tool and exact inference (1pt – **)

Answer the previous two questions using some approximate inference algorithm. Comment on the di erence (1pt)
Just to be clear you DO NOT need to do the exact or approximate inference by hand but just use the tool with the right conventions to get the results.

Grading
The submission should be a single PDF containing code snippets, text and gures with your answers in order. The questions are worth a total of 10 points. Grading will be based on the content of the answers as well as the quality of the report.
4