ESTR 3108 Review
Agents
Agents include humans, robots, softbots, thermostats, etc.
The agent function maps from percept histories (sequences) to actions: $f: P^* \rightarrow A$
The agent program runs on the physical architecture to produce $f$.
- Percepts: sensor readings
- Actions: outputs
- Goals: performance measure (what the agent is to achieve)
- Environment: the external world
Rational agent: for each possible percept sequence, a rational agent should choose an action
that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge (of the environment) and capabilities (possible actions) it has.
Basic types of agents in order of increasing generality:
Simple reflex agents
Agents that act on condition-action rules matching the current percept.
Model-based reflex agents with state
Agents that maintain some internal state of the world, updated using a model from the previous state, the last action, and the current percept, in order to choose an action.
Model-based, Goal-based agents
Knowing about the current state of the environment is not always enough to decide what to do. The agent needs a GOAL describing desirable situations.
Model-based, Utility-based agents
Goals alone are not really enough to generate high-quality behavior (e.g. quicker? safer? cheaper?). Often only some goals can be achieved, so the agent needs a utility to choose a compromise action.
Learning Agents
The learning element is responsible for making improvements to the performance element.
The performance element is what was previously the entire agent: it takes in percepts and decides on actions.
The critic tells the learning element how well the agent is doing w.r.t. a fixed standard of performance. It is necessary because the percepts themselves provide no indication of the agent’s success.
The problem generator is responsible for suggesting problems and actions that lead to new and informative experiences.
Problem-solving agents
- Goal formulation
- Problem formulation: decide what actions and states to consider
- Search
- Execution
4 problem types
Single-state problems
deterministic, fully observable: the agent considers a single state at a time
Multiple-state problems
non-observable: the agent reasons about the set of states that it might get to
Contingency problems
nondeterministic and/or partially observable: the agent calculates a whole tree of actions rather than a single action sequence
Exploration problems
unknown state space: the agent must experiment, discovering the actions and states
State space: the set of all states reachable from the initial state by any sequence of actions
Tree search algorithm
Basic idea:
From the current state, generate a new set of states (expanding the state); choose which candidate to expand next according to a search strategy; the process builds a search tree of search nodes.
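A minimal sketch of this loop (the `Problem` interface with `initial`, `is_goal`, and `successors` is an assumed illustration, not from the course):

```python
from collections import deque

# Generic tree-search sketch. Assumes a problem object with .initial,
# .is_goal(state), and .successors(state) -> [(action, state)] (illustrative names).
def tree_search(problem, lifo=True):
    frontier = deque([(problem.initial, [])])   # node = (state, path of actions)
    while frontier:
        # The removal order *is* the search strategy:
        # pop() (LIFO) gives DFS, popleft() (FIFO) gives BFS.
        state, path = frontier.pop() if lifo else frontier.popleft()
        if problem.is_goal(state):
            return path
        for action, child in problem.successors(state):   # expand the state
            frontier.append((child, path + [action]))
    return None   # search space exhausted, no solution
```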
Evaluation methods
- completeness. Always find a solution if exists?
- time complexity: number of nodes
- space complexity: maximum number of nodes in memory
- optimality: always find a least-cost solution?
BFS
- FIFO queue
- complete
- time and space: $O(b^{d+1})$, b: branching factor, d: depth of the optimal solution
- optimal (if cost = 1 per step)
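For instance, a small runnable BFS on a made-up graph (the graph and node names are illustrative):

```python
from collections import deque

# BFS on an example graph: a FIFO queue of paths, shallowest node expanded first.
graph = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F'], 'D': [], 'E': ['F'], 'F': []}

def bfs(start, goal):
    frontier = deque([[start]])            # FIFO queue of paths
    while frontier:
        path = frontier.popleft()          # shallowest node first
        if path[-1] == goal:
            return path                    # first hit is shallowest => optimal
        for child in graph[path[-1]]:      # when every step costs 1
            frontier.append(path + [child])
    return None

print(bfs('A', 'F'))   # ['A', 'C', 'F']
```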
Uniform cost search
- expand the lowest-cost node rather than the lowest-depth node
- complete
- time: number of nodes with g <= cost of optimal solution. g: path cost
- space: number of nodes with g <= cost of optimal solution
- optimal
DFS
- LIFO queue
- not complete: fails in infinite-depth spaces
- time: $O(b^m)$, m: max depth of the search tree, b: branching factor
- space: $O(bm)$
- not optimal
Depth limited search
depth first search with depth limit L
Iterative deepening search
iterate the limit L from 0 to infinity
- complete
- time: O($b^d$)
- space: O(bd)
- optimal if cost same per step
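A sketch of both ideas together, reusing the assumed `problem` interface from above (the `max_limit` guard is an illustrative simplification of iterating to infinity):

```python
# Depth-limited DFS: recurse but refuse to go deeper than `limit`.
def depth_limited(problem, state, limit):
    if problem.is_goal(state):
        return []
    if limit == 0:
        return None                                # cutoff reached
    for action, child in problem.successors(state):
        result = depth_limited(problem, child, limit - 1)
        if result is not None:
            return [action] + result
    return None

# Iterative deepening: raise the limit L = 0, 1, 2, ... until a solution appears.
def iterative_deepening(problem, max_limit=50):
    for limit in range(max_limit + 1):
        result = depth_limited(problem, problem.initial, limit)
        if result is not None:
            return result                          # found at depth = limit
    return None
```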
Best first search
Nodes are ordered so that the node with the best evaluation is expanded first.
Greedy search
- Evaluation function h(n) = estimate of cost from node n to the closest goal
- h can be any function as long as h(n) = 0 when n is a goal, e.g. straight-line distance
- greedy search: best-first search that uses h to select the next node to expand
- not complete (can get stuck in loops)
- time, space: $O(b^m)$, m: max depth of the search tree (keeps all nodes in memory)
- not optimal
A* search
- evaluation function f(n) = g(n) + h(n), g: cost so far to reach n
- admissibility: $0 \leq h(n) \leq$ true cost from n to goal
- complete, unless there are infinitely many nodes with $f \leq f(Goal)$
- time: exponential in [relative error in h $\times$ length of solution]
- space: keeps all nodes in memory
- optimal
- no other optimal algorithm is guaranteed to expand fewer nodes than A* (an algorithm that does not expand all nodes with f below the optimal cost may miss the optimal solution)
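A compact A* sketch on a made-up weighted graph (graph, costs, and h values are all illustrative; note that setting h = 0 everywhere reduces this to uniform cost search):

```python
import heapq

graph = {'S': [('A', 1), ('B', 4)], 'A': [('G', 5)], 'B': [('G', 1)], 'G': []}
h = {'S': 4, 'A': 3, 'B': 1, 'G': 0}      # admissible: h(n) <= true cost to G

def astar(start, goal):
    frontier = [(h[start], 0, start, [start])]       # (f = g + h, g, state, path)
    best_g = {}
    while frontier:
        f, g, state, path = heapq.heappop(frontier)  # lowest f first
        if state == goal:
            return g, path
        if g >= best_g.get(state, float('inf')):
            continue                                 # already reached more cheaply
        best_g[state] = g
        for child, cost in graph[state]:
            heapq.heappush(frontier,
                           (g + cost + h[child], g + cost, child, path + [child]))
    return None

print(astar('S', 'G'))   # (5, ['S', 'B', 'G'])
```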
Dominance
If $h_2(n) \geq h_1(n)$ for all n (both admissible), then $h_2$ dominates $h_1$ and is better for search.
$N = 1 + b^* + (b^*)^2 + \dots + (b^*)^d$
N: total number of expanded nodes, $b^*$: effective branching factor, d: solution depth. Use $b^*$ to determine a heuristic’s overall usefulness.
Recursive Best First Search RBFS
| | Deterministic | Stochastic |
| --- | --- | --- |
| Single point | Hill climbing | Simulated annealing |
| Multiple points | Multi-start | GA |

Multiple points: like sending the whole class out to search.
Hill climbing
If the best neighbour’s value < current value, return current
else current = best neighbour
problem: gets stuck on local maxima
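A minimal sketch (the `neighbours` and `value` callables are assumed problem-specific helpers):

```python
# Hill climbing: greedily move to the best neighbour until none improves.
def hill_climbing(current, neighbours, value):
    while True:
        best = max(neighbours(current), key=value, default=None)
        if best is None or value(best) <= value(current):
            return current            # no neighbour improves: local maximum
        current = best

# Toy usage: maximise -(x - 3)^2 over integers, neighbours are x +/- 1.
print(hill_climbing(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2))  # 3
```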
Simulated annealing
- escape local maxima by allowing some “bad” moves (next = randomly selected) but gradually decrease their size and frequency
- if next is better, the move is always executed
- otherwise it is accepted with probability $p = e^{\Delta E/T}$
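A sketch with an assumed geometric cooling schedule (schedule and parameters are illustrative, not from the notes):

```python
import math, random

# Simulated annealing (maximisation). neighbours/value are assumed helpers.
def simulated_annealing(current, neighbours, value, t0=1.0, cooling=0.95, steps=1000):
    t = t0
    for _ in range(steps):
        nxt = random.choice(neighbours(current))   # next = randomly selected
        delta_e = value(nxt) - value(current)
        if delta_e > 0 or random.random() < math.exp(delta_e / t):
            current = nxt     # good moves always; bad moves with p = e^(dE/T)
        t *= cooling          # T decreases, so bad moves become rarer
    return current
```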
Genetic algorithms
1. Choose an initial population.
2. Evaluate the fitness of each individual (fitness function).
3. Select individuals to produce the next generation in proportion to fitness (higher scores give a higher chance of being picked). We do not pick only the top scorers, because doing so may converge to a local optimum rather than the global one.
4. Alter the population (crossover and mutation).
5. Repeat until the stopping condition is satisfied.
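A toy sketch of this loop on the one-max problem (maximise the number of 1-bits); all parameter values are illustrative:

```python
import random

def genetic_algorithm(pop_size=20, length=16, generations=100, p_mut=0.05):
    fitness = lambda ind: sum(ind)                     # one-max fitness
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Fitness-proportionate selection: higher score => higher chance,
        # but weaker individuals may still be picked (helps avoid local optima).
        weights = [fitness(ind) + 1 for ind in pop]
        nxt = []
        for _ in range(pop_size):
            a, b = random.choices(pop, weights=weights, k=2)   # two parents
            cut = random.randrange(1, length)                  # crossover point
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if random.random() < p_mut else bit
                     for bit in child]                          # mutation
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)
```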
Game play
Minimax
- complete if tree is finite
- optimal if opponent is optimal
- time :$O(b^m)$
- space: O(bm) (depth-first exploration)
Resource limits: apply a cutoff test (e.g. a depth limit) and an evaluation function = estimated desirability of a position
Evaluation function
- agree with the utility function on terminal states
- not take too long to compute
- accurately reflect the actual chances of winning
Quiescence search
Expand non-quiescent (unstable) positions further until quiescent positions are reached.
Alpha-Beta pruning
Computes the correct minimax decision without looking at every node in the search tree.
With perfect move ordering: time = $O(b^{m/2})$
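A sketch of minimax with alpha-beta on an explicit game tree (the tree encoding, nested lists with utilities at the leaves, is an assumed illustration):

```python
# Alpha-beta: returns the minimax value while pruning branches that
# cannot affect the decision.
def alphabeta(node, maximizing, alpha=float('-inf'), beta=float('inf')):
    if not isinstance(node, list):          # terminal state: return its utility
        return node
    if maximizing:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                       # beta cut-off: MIN will avoid this
        return value
    value = float('inf')
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break                           # alpha cut-off
    return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # classic 2-ply example
print(alphabeta(tree, True))                 # 3 (second subtree pruned after 2)
```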
Logic
Syntax defines the sentences in the language
Semantics define the meaning of sentences
m is a model of a sentence s if s is true in m
M(s) = set of all models of s
KB entails s iff $M(KB) \subseteq M(s)$
- Soundness: i is sound if whenever $KB \vdash_i s$, then $KB \models s$
- Completeness: i is complete if whenever $KB \models s$, then $KB \vdash_i s$. Derivations can produce all entailed sentences.
Valid: true in all models
Satisfiable: true in some models
Unsatisfiable: true in no models
Horn Form
conjunction of Horn clauses
Horn clauses
proposition symbol, or (conjunction of symbols) $\rightarrow$ symbol
Backward chaining
1. Check if q is known already.
2. Otherwise, prove by BC all premises of some rule concluding q.
- Forward chaining (see the sketch after this list)
  - data-driven
  - may do lots of work that is irrelevant to the goal
- Backward chaining
  - goal-driven
  - complexity of BC can be much less than linear in the size of the KB
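A propositional forward-chaining sketch (the rule encoding as (premises, conclusion) pairs is an assumed illustration):

```python
# Data-driven inference over a Horn KB: fire every rule whose premises are
# all satisfied, until no new facts appear.
def forward_chaining(rules, facts, query):
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in inferred and all(p in inferred for p in premises):
                inferred.add(conclusion)
                changed = True
    return query in inferred

# Illustrative KB: (a AND b -> c), (c -> d), with facts a, b; entails d.
rules = [({'a', 'b'}, 'c'), ({'c'}, 'd')]
print(forward_chaining(rules, {'a', 'b'}, 'd'))   # True
```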
Resolution
CNF (universal: any sentence can be converted)
conjunction of “disjunctions of literals”; conversion rules:
- $ a\leftrightarrow b\Rightarrow(a\rightarrow b) \wedge(b\rightarrow a)$
- $a\rightarrow b\Rightarrow \neg a \vee b$
- $ \neg( a \vee b)\Rightarrow \neg a \wedge \neg b$
- $(a \wedge b) \vee c \Rightarrow (a \vee c) \wedge ( b \vee c)$
Resolution inference rule
sound and complete for propositional logic
To prove $KB \models a$ by contradiction, show that $KB \wedge \neg a$ is unsatisfiable
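For example, resolving on the complementary literals $b$ and $\neg b$: from $a \vee b$ and $\neg b \vee c$, resolution derives $a \vee c$.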
First Order Logic
- Terms: Function(Term,…) | constant | variable
- atomic sentences: a predicate symbol followed by a parenthesized list of terms (an equality is also an atomic sentence)
- complex sentences: built from atomic sentences with logical connectives (and, or, not, implies, iff)
- quantifiers: for all ($\forall$), there exists ($\exists$)
Well-formed formula (wff): a sentence in which all variables are properly introduced (quantified).
FOL: one can quantify over objects but not over relations or functions on these objects
Higher order logic : quantify over relations and functions
$(\lambda x,y\ x^2-y^2)(25,24) = 25^2 - 24^2 = 49$
$\exists!$ = unique existence (there exists exactly one)
Universal instantiation / Existential instantiation
UI can be applied several times to add new sentences; the new KB is logically equivalent to the old.
EI can be applied once to replace the existential sentence (with a new constant); the new KB is not equivalent to the old.
Unification
UNIFY(p, q) = $\theta$ where SUBST($\theta$, p) = SUBST($\theta$, q)
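A simplified unification sketch (the term encoding, the lower-case-variable convention, and the omission of the occurs-check are all assumed simplifications):

```python
# Terms: a string starting lower-case is a variable; other strings are
# constants; tuples ('F', arg1, ...) are functions/predicates.
def is_var(t):
    return isinstance(t, str) and t[0].islower()

def unify(p, q, theta):
    if theta is None:
        return None                      # an earlier step already failed
    if p == q:
        return theta
    if is_var(p):
        return unify_var(p, q, theta)
    if is_var(q):
        return unify_var(q, p, theta)
    if isinstance(p, tuple) and isinstance(q, tuple) and len(p) == len(q):
        for a, b in zip(p, q):           # unify argument lists pairwise
            theta = unify(a, b, theta)
        return theta
    return None

def unify_var(var, x, theta):
    if var in theta:
        return unify(theta[var], x, theta)
    theta = dict(theta)
    theta[var] = x                       # occurs-check omitted for brevity
    return theta

# UNIFY(Knows(John, x), Knows(John, Jane)) = {x: Jane}
print(unify(('Knows', 'John', 'x'), ('Knows', 'John', 'Jane'), {}))
```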
Backward chaining and forward chaining in FOL
- BC
  - space is linear in the size of the proof
  - incomplete due to infinite loops (fix: check the current goal against every goal on the stack)
  - inefficient due to repeated subgoals (fix: cache previous results)
- FC
  - sound and complete for first-order definite clauses
  - may not terminate in general if $\alpha$ is not entailed
  - entailment with definite clauses is semi-decidable
Learning From Examples
A decision tree is not always a good way to represent a function, because the truth table is exponentially large in the number of attributes.
- If there are no examples left, it means that no such example has been observed; return the majority classification of the node’s parent.
- If there are no attributes left but both positive and negative examples remain, these examples have exactly the same description but different classifications. This happens when
  - there is noise in the data
  - the attributes do not fully describe the examples
  - the domain is non-deterministic
**Ockham’s razor.** The most likely hypothesis is the simplest one that is consistent with all observations.
Information content I
$I(P(v_1),\dots,P(v_n)) = \sum_i -P(v_i)\log_2 P(v_i)$
The smaller $P(v_i)$ is, the bigger $-\log_2 P(v_i)$ is.
$Gain(A) = I(\frac{p}{p+n}, \frac{n}{p+n}) - Remainder(A)$
$Remainder(A) = \sum_i \frac{p_i+n_i}{p+n} I(\frac{p_i}{p_i+n_i}, \frac{n_i}{p_i+n_i})$
We select the attribute with the largest Gain.
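A small sketch computing I and Gain directly from these formulas (the split counts in the example are made up):

```python
import math

# I over a probability distribution; skip zero-probability terms (0 log 0 = 0).
def I(*probs):
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Gain(A) for p positive / n negative examples, where attribute A splits
# them into branches with counts (p_i, n_i).
def gain(p, n, splits):
    remainder = sum((pi + ni) / (p + n) * I(pi / (pi + ni), ni / (pi + ni))
                    for pi, ni in splits)
    return I(p / (p + n), n / (p + n)) - remainder

# Example: 6 positive, 6 negative; A splits into (4,0), (2,2), (0,4).
print(gain(6, 6, [(4, 0), (2, 2), (0, 4)]))   # 1.0 - 1/3 ≈ 0.667
```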
- False negative: the hypothesis must be generalized: if $C_2(x) \Rightarrow C_1(x)$, then $C_1$ is a generalization of $C_2$; the too-specific $C_2$ is to be generalized
  - by dropping conditions or adding OR (disjunction) conditions
- False positive: the hypothesis must be specialized; specialization is the dual operation (e.g. adding conditions or removing disjuncts)
Sample size: $m \geq \frac{1}{\epsilon}(\ln\frac{1}{\delta} + \ln|H|)$
$\epsilon$: max error, $\delta$: probability of failure (error exceeding $\epsilon$), $|H|$: number of possible hypotheses
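A quick numeric check of the bound with assumed example values:

```python
import math

# epsilon = 0.1, delta = 0.05, |H| = 2**10 are made-up example values.
epsilon, delta, H = 0.1, 0.05, 2 ** 10
m = (1 / epsilon) * (math.log(1 / delta) + math.log(H))
print(math.ceil(m))   # 100 examples suffice (m ≈ 99.3)
```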