
ID3 algorithm

Cactus spines


Imagine you have a dataset of cactus samples, and your task is to predict the cactus subfamily. You'd like to find the entropy for the feature spines, which takes two categorical values: yes or no.

Below is an excerpt from the cactus dataset with 12 samples: 9 of them have spines and 3 do not. Your task is to calculate the entropy for the whole feature spines. Round your answer to 2 decimal places.

| Name                | Spines | Subfamily     |
|---------------------|--------|---------------|
| Browningia          | yes    | Cactoideae    |
| Pterocereus         | yes    | Cactoideae    |
| Kimnachia           | no     | Cactoideae    |
| Maihuenia poeppigii | no     | Cactoideae    |
| Opuntia             | yes    | Opuntioideae  |
| Grusonia            | yes    | Opuntioideae  |
| Tunilla             | yes    | Opuntioideae  |
| Maihueniopsis       | yes    | Opuntioideae  |
| Tacinga             | yes    | Opuntioideae  |
| Cumulopuntia        | yes    | Opuntioideae  |
| Pereskiopsis        | yes    | Opuntioideae  |
| Pereskia            | no     | Pereskioideae |

Tip: Calculate $E(X|\text{spines = yes})$, $E(X|\text{spines = no})$, and $E(X|\text{spines})$.
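
The three quantities in the tip can be sketched in Python. This is a minimal illustration, not the platform's reference solution. It assumes base-2 logarithms (the usual choice for ID3) and reads $E(X|\text{spines})$ as the average of the two subset entropies, each weighted by its share of the 12 samples. The names `samples`, `entropy`, and `conditional_entropy` are hypothetical helpers introduced only for this example.

```python
from collections import Counter
from math import log2

# (spines, subfamily) pairs copied row by row from the cactus table
samples = [
    ("yes", "Cactoideae"),     # Browningia
    ("yes", "Cactoideae"),     # Pterocereus
    ("no", "Cactoideae"),      # Kimnachia
    ("no", "Cactoideae"),      # Maihuenia poeppigii
    ("yes", "Opuntioideae"),   # Opuntia
    ("yes", "Opuntioideae"),   # Grusonia
    ("yes", "Opuntioideae"),   # Tunilla
    ("yes", "Opuntioideae"),   # Maihueniopsis
    ("yes", "Opuntioideae"),   # Tacinga
    ("yes", "Opuntioideae"),   # Cumulopuntia
    ("yes", "Opuntioideae"),   # Pereskiopsis
    ("no", "Pereskioideae"),   # Pereskia
]


def entropy(labels):
    """Shannon entropy (base 2, an assumption) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def conditional_entropy(rows):
    """Subset entropies of the subfamily, weighted by each subset's size."""
    total = len(rows)
    result = 0.0
    for value in {spines for spines, _ in rows}:
        subset = [subfamily for spines, subfamily in rows if spines == value]
        result += len(subset) / total * entropy(subset)
    return result


# Entropy of the subfamily within each value of the feature
e_yes = entropy([sub for spines, sub in samples if spines == "yes"])
e_no = entropy([sub for spines, sub in samples if spines == "no"])

# Entropy for the whole feature: (9/12) * e_yes + (3/12) * e_no
e_spines = conditional_entropy(samples)

print(round(e_yes, 2), round(e_no, 2), round(e_spines, 2))
```

Weighting by subset size matters here because the "yes" branch holds 9 of the 12 samples, so its entropy contributes three times as much to the final value as the "no" branch.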

