ARI711S
First Exam (continued)
June 2022
• coord(x, y, z): tile x is located at row y and column z.
Following the STRIPS notation, define the actions as operators for the planner.
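For orientation only, a STRIPS operator is a triple of preconditions, an add list and a delete list over ground facts. The sketch below is one possible encoding, not the expected answer. The names `Operator`, `apply_operator` and `slide`, and the use of a pseudo-tile called "blank" for the empty cell, are assumptions introduced for illustration. Only the coord predicate comes from the question.

```python
from typing import FrozenSet, NamedTuple, Tuple

Fact = Tuple[object, ...]  # e.g. ("coord", "a", 1, 2)


class Operator(NamedTuple):
    name: str
    preconditions: FrozenSet[Fact]
    add_list: FrozenSet[Fact]
    delete_list: FrozenSet[Fact]


def applicable(state: FrozenSet[Fact], op: Operator) -> bool:
    # An operator is applicable when all its preconditions hold in the state.
    return op.preconditions <= state


def apply_operator(state: FrozenSet[Fact], op: Operator) -> FrozenSet[Fact]:
    # STRIPS semantics: remove the delete list, then add the add list.
    if not applicable(state, op):
        raise ValueError(f"{op.name} is not applicable")
    return (state - op.delete_list) | op.add_list


def slide(tile: str, r1: int, c1: int, r2: int, c2: int) -> Operator:
    # Hypothetical operator: slide `tile` from (r1, c1) into the empty cell
    # at (r2, c2). The empty cell is modelled as a tile named "blank".
    if abs(r1 - r2) + abs(c1 - c2) != 1:
        raise ValueError("cells must be horizontally or vertically adjacent")
    old_tile = ("coord", tile, r1, c1)
    old_blank = ("coord", "blank", r2, c2)
    return Operator(
        name=f"slide({tile},{r1},{c1},{r2},{c2})",
        preconditions=frozenset({old_tile, old_blank}),
        add_list=frozenset({("coord", tile, r2, c2), ("coord", "blank", r1, c1)}),
        delete_list=frozenset({old_tile, old_blank}),
    )
```

Grounding the operator for every pair of adjacent cells gives the action set a planner would search over.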
(b) Express the goal (in Figure 2b) as a well-formed formula.
[5]
(c) Using the Manhattan distance between tiles (the sum of the absolute values of the row
difference and column difference) as a heuristic and the number of steps as path cost,
generate the plan using the A* search strategy. You will explicitly indicate the evaluation
step-by-step during the plan generation.
[18]
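As a rough illustration of the evaluation asked for in (c), the sketch below computes f(n) = g(n) + h(n), where g is the number of steps taken so far and h is a Manhattan-distance heuristic. The dictionary layout (tile name mapped to a (row, column) pair), the function names, and the choice to sum the distance over every tile listed in the goal are assumptions. The question itself only defines the distance between two positions.

```python
from typing import Dict, Tuple

Position = Tuple[int, int]  # (row, column)


def manhattan(a: Position, b: Position) -> int:
    # Sum of the absolute row difference and the absolute column difference.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def heuristic(state: Dict[str, Position], goal: Dict[str, Position]) -> int:
    # Assumption: h is the total distance of each goal tile from its target cell.
    return sum(manhattan(state[tile], goal[tile]) for tile in goal)


def evaluate(steps_so_far: int, state: Dict[str, Position],
             goal: Dict[str, Position]) -> int:
    # A* evaluation function: f(n) = g(n) + h(n).
    return steps_so_far + heuristic(state, goal)


# Made-up positions, not the configuration of Figure 2.
state = {"a": (1, 1), "b": (2, 3)}
goal = {"a": (1, 2), "b": (3, 3)}
print(evaluate(3, state, goal))  # g = 3, h = 1 + 1, so f = 5
```

At each expansion, A* selects the frontier node with the smallest f value, so listing g, h and f for every generated node gives the step-by-step evaluation.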
Question 3 [20 points]
The Millionaire is your favourite TV show. It is a ten-round game. Except for the first round,
the player can choose to play or quit at each round. When the player quits, the game ends,
and s/he can collect the rewards that s/he has earned so far. When the player plays, s/he
can succeed and move to the next round or fail, leading to the end of the game. Note that
s/he loses all the rewards s/he has accumulated so far in the event of a failure. Note also that
when the player reaches the last round, whether s/he plays or not the game ends with the
appropriate reward.
Model this problem as a Markov decision process and find the optimal policy using the value
iteration approach. You will indicate the utility values during each iteration. You will use a
discount factor of 0.95.
Table 1: Millionaire - Rewards and success probability
Round   Success Probability   Reward
1       0.99                  10
2       0.9                   50
3       0.8                   100
4       0.7                   500
5       0.6                   1000
6       0.5                   5000
7       0.4                   10000
8       0.3                   50000
9       0.2                   100000
10      0.1                   500000
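One way to set up the computation is sketched below. It is a sketch under stated assumptions, not a model answer. The assumptions are: the state is the current round, the Reward column is the total amount held after succeeding at that round (not an increment), quitting at round k pays the reward of round k-1 immediately, failing pays nothing, and the discount applies only when the game moves on to the next round. All function and variable names are illustrative.

```python
GAMMA = 0.95
P_SUCCESS = [0.99, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
REWARD = [10, 50, 100, 500, 1000, 5000, 10000, 50000, 100000, 500000]
N_ROUNDS = 10


def q_values(k, utilities):
    """Expected utility of each action available at round k (1-based)."""
    p = P_SUCCESS[k - 1]
    if k == N_ROUNDS:
        # Last round: success pays the final reward and the game ends.
        play = p * REWARD[N_ROUNDS - 1]
    else:
        # Success moves to round k + 1; failure ends the game with nothing.
        play = p * GAMMA * utilities[k + 1]
    q = {"play": play}
    if k > 1:
        # Quitting is not allowed in round 1; otherwise it banks the reward
        # earned in the previous round (assumption).
        q["quit"] = float(REWARD[k - 2])
    return q


def value_iteration(tol=1e-9, max_iter=100):
    utilities = {k: 0.0 for k in range(1, N_ROUNDS + 1)}
    history = [dict(utilities)]  # utilities after each sweep
    for _ in range(max_iter):
        updated = {k: max(q_values(k, utilities).values()) for k in utilities}
        delta = max(abs(updated[k] - utilities[k]) for k in utilities)
        utilities = updated
        history.append(dict(utilities))
        if delta < tol:
            break
    policy = {}
    for k in utilities:
        q = q_values(k, utilities)
        policy[k] = max(q, key=q.get)  # greedy action for the final utilities
    return utilities, policy, history


if __name__ == "__main__":
    utilities, policy, history = value_iteration()
    for i, sweep in enumerate(history):
        print(i, {k: round(v, 2) for k, v in sweep.items()})
    print(policy)
```

The `history` list holds the utility of every round after each sweep, which corresponds to the per-iteration values the question asks for. A different reading of the reward structure (for example, cumulative rewards) would change the `quit` and final-round terms.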
Question 4 [18 points]
(a) [8] Company Z operates near a river, which it pollutes. This harms the fishermen who
fish from the river. Company Z's profit is P. Normally, the fishermen get a profit of A.
However, with the negative effect on the river, they now lose A0 (A > A0) from their
profit. The fishermen and Company Z engage in litigation. Both teams will simultaneously
indicate an amount. For Company Z, the amount represents their claim about the
company's profit, while for the fishermen, the amount represents their profit loss. If
the company's profit is less than the fishermen's lost profit, Company Z will shut down.
Otherwise, it will pay a settlement to the fishermen corresponding to their lost profit