- Learning involves acquiring general concepts from specific training examples. Example: People continually learn general concepts or categories such as "bird," "car," "situations in which I should study more in order to pass the exam," etc.
- Each such concept can be viewed as describing some subset of objects or events defined over a larger set.
- Alternatively, each concept can be thought of as a Boolean-valued function defined over this larger set. (Example: A function defined over all animals, whose value is true for birds and false for other animals.)
Definition: Concept learning is the task of inferring a Boolean-valued function from training examples of its input and output.
A CONCEPT LEARNING TASK
Consider the example task of learning the target concept "Days on which Aldo enjoys his favorite water sport."
Example
| Example | Sky | AirTemp | Humidity | Wind | Water | Forecast | EnjoySport |
|---|---|---|---|---|---|---|---|
| 1 | Sunny | Warm | Normal | Strong | Warm | Same | Yes |
| 2 | Sunny | Warm | High | Strong | Warm | Same | Yes |
| 3 | Rainy | Cold | High | Strong | Warm | Change | No |
| 4 | Sunny | Warm | High | Strong | Cool | Change | Yes |
Table: Positive and negative training examples for the target concept EnjoySport.
The task is to learn to predict the value of EnjoySport for an arbitrary day, based on the values of its other attributes.
What hypothesis representation is provided to the learner?
- Let’s consider a simple representation in which each hypothesis consists of a conjunction of constraints on the instance attributes.
- Let each hypothesis be a vector of six constraints, specifying the values of the six attributes Sky, AirTemp, Humidity, Wind, Water, and Forecast. For each attribute, the hypothesis will either:
- Indicate by a "?" that any value is acceptable for this attribute,
- Specify a single required value (e.g., Warm) for the attribute, or
- Indicate by a "Φ" that no value is acceptable.
If some instance x satisfies all the constraints of hypothesis h, then h classifies x as a positive example (h(x) = 1).
The hypothesis that Aldo enjoys his favorite sport only on cold days with high humidity is represented by the expression
(?, Cold, High, ?, ?, ?)
The most general hypothesis, that every day is a positive example, is represented by
(?, ?, ?, ?, ?, ?)
The most specific possible hypothesis, that no day is a positive example, is represented by
(Φ, Φ, Φ, Φ, Φ, Φ)
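The representation above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the names `ANY`, `NONE`, and `classify` are assumptions introduced here, and hypotheses and instances are assumed to be tuples ordered as (Sky, AirTemp, Humidity, Wind, Water, Forecast).

```python
ANY = "?"    # any value is acceptable for this attribute
NONE = "Φ"   # no value is acceptable for this attribute


def classify(h, x):
    """Return 1 if instance x satisfies every constraint of hypothesis h, else 0."""
    for constraint, value in zip(h, x):
        if constraint == NONE:
            return 0  # a Φ constraint can never be satisfied
        if constraint != ANY and constraint != value:
            return 0  # a specific required value was not met
    return 1


# Hypotheses from the text (the variable names are illustrative).
cold_and_humid = (ANY, "Cold", "High", ANY, ANY, ANY)
most_general = (ANY,) * 6
most_specific = (NONE,) * 6

# Training examples 1 and 3 from the EnjoySport table.
day1 = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
day3 = ("Rainy", "Cold", "High", "Strong", "Warm", "Change")

print(classify(cold_and_humid, day3))  # 1: cold day with high humidity
print(classify(cold_and_humid, day1))  # 0: AirTemp is Warm, not Cold
print(classify(most_general, day1))    # 1: every day is positive
print(classify(most_specific, day1))   # 0: no day is positive
```

Because a hypothesis is a conjunction, a single violated constraint is enough to classify the instance as negative, which is why the loop can return early.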
Notation
- The set of items over which the concept is defined is called the set of instances, which is denoted by X.
Example: X is the set of all possible days, each represented by the attributes: Sky, AirTemp, Humidity, Wind, Water, and Forecast
- The concept or function to be learned is called the target concept, which is denoted by c.
c can be any Boolean-valued function defined over the instances X:
c: X → {0, 1}
Example: The target concept corresponds to the value of the attribute EnjoySport
(i.e., c(x) = 1 if EnjoySport = Yes, and c(x) = 0 if EnjoySport = No).
- Instances for which c(x) = 1 are called positive examples, or members of the target concept.
- Instances for which c(x) = 0 are called negative examples, or non-members of the target concept.
- The ordered pair (x, c(x)) denotes the training example consisting of the instance x and its target concept value c(x).
- D denotes the set of available training examples.
- The symbol H denotes the set of all possible hypotheses that the learner may consider regarding the identity of the target concept. Each hypothesis h in H represents a Boolean-valued function defined over X:
h: X → {0, 1}
The goal of the learner is to find a hypothesis h such that h(x) = c(x) for all x in X.
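The learner cannot check h(x) = c(x) over all of X directly; the only values of c it sees are those in the training examples D. A minimal sketch of that weaker check, agreement of h with c on D, is given below. The names `h_value`, `consistent`, and the encoding of D as a list of (x, c(x)) pairs are assumptions made for illustration.

```python
# Training examples D from the EnjoySport table, encoded as (x, c(x)) pairs.
# Attribute order: Sky, AirTemp, Humidity, Wind, Water, Forecast.
D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), 1),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), 1),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), 0),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), 1),
]


def h_value(h, x):
    """Evaluate hypothesis h on instance x: 1 if all constraints hold, else 0."""
    ok = all(c == "?" or (c != "Φ" and c == v) for c, v in zip(h, x))
    return int(ok)


def consistent(h, examples):
    """True if h(x) == c(x) for every training example (x, c(x))."""
    return all(h_value(h, x) == c_x for x, c_x in examples)


# Illustrative hypotheses (not prescribed by the text).
h_sunny_warm_strong = ("Sunny", "Warm", "?", "Strong", "?", "?")
h_most_general = ("?",) * 6
h_cold_humid = ("?", "Cold", "High", "?", "?", "?")

print(consistent(h_sunny_warm_strong, D))  # True
print(consistent(h_most_general, D))       # False: example 3 is negative
print(consistent(h_cold_humid, D))         # False: example 1 is positive
```

Agreement on D is only evidence about the rest of X, not a guarantee; several different hypotheses can agree with the same four examples.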
Given:
- Instances X: Possible days, each described by the attributes
- Sky (with possible values Sunny, Cloudy, and Rainy),
- AirTemp (with values Warm and Cold),
- Humidity (with values Normal and High),
- Wind (with values Strong and Weak),
- Water (with values Warm and Cool),
- Forecast (with values Same and Change).
- Hypotheses H: Each hypothesis is described by a conjunction of constraints on the attributes Sky, AirTemp, Humidity, Wind, Water, and Forecast. The constraints may be "?" (any value is acceptable), "Φ" (no value is acceptable), or a specific value.
- Target concept c: EnjoySport : X → {0, 1}
- Training examples D: Positive and negative examples of the target function
Determine:
- A hypothesis h in H such that h(x) = c(x) for all x in X.
Table: The EnjoySport concept learning task.
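To get a feel for the size of this task, the attribute values listed above can be used to count the instances in X and the hypotheses in H. The sketch below is illustrative; the dictionary `ATTRIBUTES` and the variable names are assumptions introduced here. It relies on one observation: any hypothesis containing a "Φ" classifies every instance as negative, so all such hypotheses represent the same function.

```python
from math import prod

# Possible values of each attribute, as listed in the task definition.
ATTRIBUTES = {
    "Sky": ["Sunny", "Cloudy", "Rainy"],
    "AirTemp": ["Warm", "Cold"],
    "Humidity": ["Normal", "High"],
    "Wind": ["Strong", "Weak"],
    "Water": ["Warm", "Cool"],
    "Forecast": ["Same", "Change"],
}

# |X|: one value per attribute.
num_instances = prod(len(values) for values in ATTRIBUTES.values())

# Syntactically distinct hypotheses: each slot holds a value, "?" or "Φ".
num_syntactic = prod(len(values) + 2 for values in ATTRIBUTES.values())

# Semantically distinct hypotheses: each slot holds a value or "?",
# plus one extra hypothesis standing for everything that contains a "Φ".
num_semantic = 1 + prod(len(values) + 1 for values in ATTRIBUTES.values())

print(num_instances)  # 96   = 3 * 2 * 2 * 2 * 2 * 2
print(num_syntactic)  # 5120 = 5 * 4 * 4 * 4 * 4 * 4
print(num_semantic)   # 973  = 1 + 4 * 3 * 3 * 3 * 3 * 3
```

Even for this small task, H is far smaller than the set of all 2^96 Boolean functions over X, which reflects the restriction to conjunctions of attribute constraints.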
The inductive learning hypothesis
Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.