# B. A dataset has 1000 records and 50 variables with 5% of the values missing, spread randomly throughout the records and variables. An analyst decides to remove records that have missing values. About how many records would you expect to be removed?

B. A dataset has 1000 records and 50 variables with 5% of the values missing, spread randomly throughout the records and variables. An analyst decides to remove records that have missing values. About how many records would you expect to be removed? (20 points)
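One way to approach this question is sketched below. It assumes that "spread randomly" means each of the 50 values in a record is missing independently with probability 0.05, so a record survives only if all 50 of its values are present. The variable names are illustrative and not part of the assignment.

```python
# Sketch: expected number of records with at least one missing value,
# assuming each value is missing independently with probability 0.05.
n_records = 1000
n_variables = 50
p_missing = 0.05

# Probability that a single record has no missing values at all.
p_complete = (1 - p_missing) ** n_variables

# Expected number of records dropped by removing incomplete records.
expected_removed = n_records * (1 - p_complete)

print(f"P(record is complete) = {p_complete:.4f}")
print(f"Expected records removed = {expected_removed:.0f}")
```

Under this independence assumption only about 8% of records are complete, so roughly 920 of the 1000 records would be removed.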

C. Given a database table containing weather data as follows:

| Outlook | Temperature | Humidity | Windy | Class: Play |
|---|---|---|---|---|
| Sunny | Hot | High | False | No |
| Sunny | Hot | High | True | No |
| Overcast | Hot | High | False | Yes |
| Rainy | Mild | High | False | Yes |
| Rainy | Cool | Normal | False | Yes |
| Rainy | Cool | Normal | True | No |
| Overcast | Cool | Normal | True | Yes |
| Sunny | Mild | High | False | No |
| Sunny | Cool | Normal | False | Yes |
| Rainy | Mild | Normal | False | Yes |
| Sunny | Mild | Normal | True | Yes |
| Overcast | Mild | High | True | Yes |
| Overcast | Hot | Normal | False | Yes |
| Rainy | Mild | High | True | No |

Here Outlook, Temperature, Humidity, and Windy are the input variables (predictors), and Play is the output variable (response).

a. Compute the prior probabilities:

P(Play = 'Yes') =

P(Play = 'No') =

b. Compute the conditional probabilities:

P(Outlook = 'Sunny' | Play = 'Yes') =

P(Outlook = 'Sunny' | Play = 'No') =

P(Temperature = 'Mild' | Play = 'Yes') =

P(Temperature = 'Mild' | Play = 'No') =

P(Humidity = 'High' | Play = 'Yes') =

P(Humidity = 'High' | Play = 'No') =

P(Windy = 'False' | Play = 'Yes') =

P(Windy = 'False' | Play = 'No') =

c. Use the naïve Bayes classification method to classify the following unknown record, and indicate whether to play or not.

(Outlook = 'Sunny', Temperature = 'Mild', Humidity = 'High', Windy = 'False')

(20 points)
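The counting behind parts a to c can be sketched as follows. This is a minimal illustration, assuming plain relative-frequency estimates with no Laplace smoothing (the problem does not say which to use). The helper names `prior`, `conditional`, and `score` are hypothetical; the rows are the 14 records of the weather table.

```python
from fractions import Fraction

COLUMNS = ("Outlook", "Temperature", "Humidity", "Windy")
ROWS = [  # (Outlook, Temperature, Humidity, Windy, Play)
    ("Sunny", "Hot", "High", "False", "No"),
    ("Sunny", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Rainy", "Mild", "High", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "True", "No"),
    ("Overcast", "Cool", "Normal", "True", "Yes"),
    ("Sunny", "Mild", "High", "False", "No"),
    ("Sunny", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "Normal", "True", "Yes"),
    ("Overcast", "Mild", "High", "True", "Yes"),
    ("Overcast", "Hot", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "High", "True", "No"),
]

def prior(label):
    """Relative frequency of a class label, P(Play = label)."""
    return Fraction(sum(1 for r in ROWS if r[-1] == label), len(ROWS))

def conditional(column, value, label):
    """Relative frequency P(column = value | Play = label), unsmoothed."""
    idx = COLUMNS.index(column)
    in_class = [r for r in ROWS if r[-1] == label]
    return Fraction(sum(1 for r in in_class if r[idx] == value), len(in_class))

def score(record, label):
    """Unnormalized naive Bayes score: prior times product of conditionals."""
    result = prior(label)
    for column, value in record.items():
        result *= conditional(column, value, label)
    return result

record = {"Outlook": "Sunny", "Temperature": "Mild",
          "Humidity": "High", "Windy": "False"}
scores = {label: score(record, label) for label in ("Yes", "No")}
prediction = max(scores, key=scores.get)

for label, s in scores.items():
    print(f"score(Play = {label}) = {s} ~ {float(s):.4f}")
print("Predicted Play =", prediction)
```

With these estimates the score for 'No' (24/875) exceeds the score for 'Yes' (8/567), so the record is classified as Play = 'No'. Using exact fractions keeps each intermediate probability easy to check against a hand count.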

D. Association Rule Mining: (20 points)

Given a transaction database for mining association rules as follows:

Database D

| TID | Items |
|---|---|
| 100 | A C D |
| 200 | B C E |
| 300 | A B C E |
| 400 | B E |

Please use the Apriori algorithm to mine association rules with minimum support count = 2.

(Please show the derivation process step by step with candidate itemsets.)
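The level-wise derivation can be sketched as below. This is an illustrative implementation of the standard Apriori join-and-prune loop, not a prescribed solution; the function name `apriori` and the printed layout are assumptions. It covers only the frequent-itemset stage, because the problem gives no minimum confidence for turning itemsets into rules.

```python
from itertools import combinations

TRANSACTIONS = {
    100: {"A", "C", "D"},
    200: {"B", "C", "E"},
    300: {"A", "B", "C", "E"},
    400: {"B", "E"},
}
MIN_SUPPORT = 2  # minimum support count from the problem statement

def apriori(transactions, min_support):
    """Return {frequent itemset: support count}, printing each level."""
    baskets = list(transactions.values())
    items = sorted({item for basket in baskets for item in basket})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Count the support of every candidate k-itemset.
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        print(f"C{k}:", {"".join(sorted(c)): n for c, n in counts.items()})
        print(f"L{k}:", {"".join(sorted(c)): n for c, n in level.items()})
        frequent.update(level)
        k += 1
        # Join step: unions of frequent itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all (k-1)-subsets are frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent

frequent = apriori(TRANSACTIONS, MIN_SUPPORT)
```

For this database the loop stops after the third level: D is dropped at the first level, AB and AE at the second, and BCE is the only 3-itemset whose 2-item subsets are all frequent. Rules such as B, C → E could then be scored by dividing the support of BCE by the support of BC, once a confidence threshold is chosen.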

