This topic continues from Naïve Bayes for Categorical Attributes. Naïve Bayes works easily with categorical attributes by counting how often each value occurs for each class. But when an attribute is numeric (e.g., temperature = 66), you can't count exact matches, because:

  • A numeric value may be unique in the dataset
  • The probability of any exact real number (like exactly 66) is effectively zero

So instead of using counts, we model the distribution of numeric values using a probability density function (PDF).

We would like to classify the following new example:
outlook=sunny, temperature=66, humidity=90, windy=true

How to calculate
$P(\text{temperature}=66 \mid \text{yes})$, $P(\text{temperature}=66 \mid \text{no})$; and $P(\text{humidity}=90 \mid \text{yes})$, $P(\text{humidity}=90 \mid \text{no})$

Using a PDF (Typically Gaussian)

A Probability Density Function (PDF) describes how likely a continuous random variable is to fall within a particular range of values.

Unlike categorical variables (which have probabilities like “3/10 chance of rain”), continuous variables (like temperature, height, weight) don’t have probabilities for exact values — because the probability of any single point is zero (e.g. 66.000000…).

Instead, we use the PDF to say:

  • How dense the probability is near a value (e.g., around 66)
  • The area under the curve between two values (like 65.5 and 66.5) gives the actual probability (see the sketch below)
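
To make this concrete, here is a minimal Python sketch (assuming a Gaussian with the class = yes temperature statistics quoted later on this page) contrasting the zero probability of an exact point with the real probability of a small interval:

```python
import math

def gaussian_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution, via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

mu, sigma = 73, 6.2  # class = yes temperature statistics (quoted later on this page)

# An exact point is an interval of zero width, so its probability is zero
print(gaussian_cdf(66, mu, sigma) - gaussian_cdf(66, mu, sigma))      # 0.0

# The area under the curve between 65.5 and 66.5 is an actual probability
print(gaussian_cdf(66.5, mu, sigma) - gaussian_cdf(65.5, mu, sigma))  # ~0.034
```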

Assumption

We assume the numeric attribute values follow a normal (Gaussian) distribution for each class.

For a normal distribution with mean $\mu$ and standard deviation $\sigma$, the probability density function is:

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

where:

  • $x$ = the numeric value you're evaluating (e.g., 66)
  • $\mu$ = the mean of the attribute values for the class
  • $\sigma$ = the standard deviation of the attribute values for the class
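
As a sketch, the formula translates directly into Python (the helper name gaussian_pdf is my own; it reappears in the examples below):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Normal density f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sqrt(2 pi) sigma)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

print(gaussian_pdf(66, mu=73, sigma=6.2))  # ~0.0340, used below for temperature given yes
```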

Calculating probabilities using PDF

Given the example: outlook=sunny, temperature=66, humidity=90, windy=true.

We want to find $P(\text{yes} \mid E)$ and $P(\text{no} \mid E)$. Recall Bayes' theorem:

$$P(\text{yes} \mid E) = \frac{P(E \mid \text{yes})\, P(\text{yes})}{P(E)}$$

and we split the evidence $E$ into 4 smaller pieces of evidence using Naive Bayes's independence assumption:

$$P(E \mid \text{yes}) = P(\text{sunny} \mid \text{yes}) \cdot P(66 \mid \text{yes}) \cdot P(90 \mid \text{yes}) \cdot P(\text{true} \mid \text{yes})$$

We already know from the Example - Weather that:

  • $P(\text{sunny} \mid \text{yes}) = 2/9$, $P(\text{true} \mid \text{yes}) = 3/9$, $P(\text{yes}) = 9/14$
  • $P(\text{sunny} \mid \text{no}) = 3/5$, $P(\text{true} \mid \text{no}) = 3/5$, $P(\text{no}) = 5/14$

We now need to find:

  • $P(\text{temperature}=66 \mid \text{yes})$ and $P(\text{temperature}=66 \mid \text{no})$
  • $P(\text{humidity}=90 \mid \text{yes})$ and $P(\text{humidity}=90 \mid \text{no})$

Solve for $P(\text{temperature}=66 \mid \text{yes})$

We know from the training data (recomputed in the sketch below):

  • Mean ($\mu$) for temperature for class = yes = 73
  • Standard deviation ($\sigma$) for temperature for class = yes = 6.2
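
These two numbers can be recomputed from the training data. A sketch, assuming the nine class = yes temperature values of the standard weather dataset (this page only quotes the resulting statistics):

```python
import statistics

# Assumed temperature readings for the nine class = yes days
temps_yes = [83, 70, 68, 64, 69, 75, 75, 72, 81]

print(statistics.mean(temps_yes))   # 73
print(statistics.stdev(temps_yes))  # ~6.16, i.e. 6.2 (sample standard deviation)
```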

Now we can calculate $P(\text{temperature}=66 \mid \text{yes})$ using the PDF:

$$P(\text{temperature}=66 \mid \text{yes}) = \frac{1}{\sqrt{2\pi} \cdot 6.2}\, e^{-\frac{(66-73)^2}{2 \cdot 6.2^2}} \approx 0.0340$$
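
Numerically, a self-contained check of the substitution above:

```python
import math

mu, sigma, x = 73, 6.2, 66  # class = yes temperature statistics, query value

density = math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
print(round(density, 4))  # 0.034
```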

Solve for the remaining probabilities

Similarly, we can solve for $P(\text{humidity}=90 \mid \text{yes})$, $P(\text{temperature}=66 \mid \text{no})$ and $P(\text{humidity}=90 \mid \text{no})$ using the same steps. From the training data:

  • $P(\text{humidity}=90 \mid \text{yes}) \approx 0.0221$ (using $\mu = 79.1$, $\sigma = 10.2$)
  • $P(\text{temperature}=66 \mid \text{no}) \approx 0.0279$ (using $\mu = 74.6$, $\sigma = 7.9$)
  • $P(\text{humidity}=90 \mid \text{no}) \approx 0.0381$ (using $\mu = 86.2$, $\sigma = 9.7$)
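
A sketch of the same calculation for the remaining three values (the per-class means and standard deviations are assumed from the same weather training data):

```python
import math

def gaussian_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

print(gaussian_pdf(90, mu=79.1, sigma=10.2))  # P(humidity=90 | yes)    ~0.0221
print(gaussian_pdf(66, mu=74.6, sigma=7.9))   # P(temperature=66 | no)  ~0.0279
print(gaussian_pdf(90, mu=86.2, sigma=9.7))   # P(humidity=90 | no)     ~0.0381
```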

Compare $P(\text{yes} \mid E)$ and $P(\text{no} \mid E)$

We substitute the values we obtained into the equations:

$$P(\text{yes} \mid E) = \frac{2/9 \times 0.0340 \times 0.0221 \times 3/9 \times 9/14}{P(E)} = \frac{0.000036}{P(E)}$$

Similarly, solve for $P(\text{no} \mid E)$:

$$P(\text{no} \mid E) = \frac{3/5 \times 0.0279 \times 0.0381 \times 3/5 \times 5/14}{P(E)} = \frac{0.000137}{P(E)}$$

Since $0.000137 > 0.000036$ (and $P(E)$ is common to both), we can conclude that for the new day, play = no is more likely than play = yes.
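
Putting it all together, here is a minimal sketch of the full comparison (categorical probabilities assumed from the Example - Weather page; $P(E)$ cancels, so we compare unnormalised scores and then normalise):

```python
# Unnormalised numerators of Bayes' theorem (P(E) is common to both classes)
score_yes = (2/9) * 0.0340 * 0.0221 * (3/9) * (9/14)  # ~0.000036
score_no  = (3/5) * 0.0279 * 0.0381 * (3/5) * (5/14)  # ~0.000137

# Normalise so the two posteriors sum to 1
total = score_yes + score_no
print(score_yes / total)  # ~0.21 -> P(yes | E)
print(score_no / total)   # ~0.79 -> P(no | E), so predict play = no
```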


Back to parent page: Supervised Machine Learning

AI Machine_Learning COMP3308 Supervised_Learning Eager_Learning Classification Naïve_Bayes Continuous_Attributes Probability_Density_Function