# Artificial Intelligence Homework 02 Solution

\$30.00

Category:

## Description

Recall from lecture1 that Sam is either t or un t

• = f t, un tg

and has to decide whether to exercise or relax

• = fexercise, relaxg

on the basis of the following (probability, reward)-matrices (p(s; a; s0); r(s; a; s0))

for row s, column s0 in table with corner a

 exercise t un t relax t un t t .99, 8 .01, 8 t .7, 10 .3, 10 un t .2, 0 .8, 0 un t 0, 5 1, 5 The -discounted value of (s; a) is lim qn(s; a)

n!1

where

q0(s; a) := p(s; a; t)r(s; a; t) + p(s; a; un t)r(s; a; un t)

Vn(s) := max(qn(s; exercise); qn(s; relax))

qn+1(s; a) := q0(s; a) + p(s; a; t)Vn( t) + p(s; a; un t)Vn(un t) :

In particular, = 0:9 leads to the following qn(s; a) for n = 0; 1; 2

 exercise relax t 8, 16.955, 23.812 10, 17.65, 23.685 relax, relax, exercise un t 0, 5.4, 10.017 5, 9.5, 13.55 relax, relax, relax

a positive integer n, a -setting G (0 < G < 1), and a state s

returns the values

qn(s; exercise) and qn(s; relax)

for = G. You may use any of the following programming languages

Prolog, Java, Python

but be prepared to demonstrate your program on Tue, March 6 (noon-1, LG 12, O’Reilly) or Wed, March 7 (10-11, LB04; on your machine).

error: Content is protected !!