2.1 Assuming that data mining techniques are to be used in the following cases, identify whether the task required is supervised or unsupervised learning.
a. Supervised: Deciding whether to issue a loan to an applicant based on demographic and financial data (with reference to a database of similar data on prior customers).
b. Unsupervised: In an online bookstore, making recommendations to customers concerning additional items to buy based on the buying patterns in prior transactions.
c. Supervised: Identifying a network data packet as dangerous (virus, hacker attack) based on comparison to other packets whose threat status is known.
d. Unsupervised: Identifying segments of similar customers.
e. Supervised: Predicting whether a company will go bankrupt based on comparing its financial data to those of similar bankrupt and non-bankrupt firms.
f. Supervised: Estimating the repair time required for an aircraft based on a trouble ticket. (Repair times are known for past tickets, so this is a prediction task.)
g. Supervised: Automated sorting of mail by zip code scanning.
h. Unsupervised: Printing of custom discount coupons at the conclusion of a grocery store checkout based on what you just bought and what others have bought previously.
2.3 Consider the sample from a database of credit applicants in Figure 2.13. Comment on the likelihood that it was sampled randomly, and whether it is likely to be a useful sample.
I don’t think the sample was random, because the records appear to be taken at every 8th person. A random sample would show irregular gaps between record numbers rather than a fixed step. I also doubt the sample would be useful, because of the types of variables it includes.
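The difference between a fixed-step sample and a random one can be seen in a minimal Python sketch. The population of 1000 record numbers, the step of 8, and the helper name `gaps` are assumptions made for illustration; they are not taken from Figure 2.13.

```python
import random

def gaps(ids):
    """Return the differences between consecutive sorted record numbers."""
    ordered = sorted(ids)
    return [b - a for a, b in zip(ordered, ordered[1:])]

# Hypothetical population of record numbers 1..1000 (an assumption).
population = list(range(1, 1001))

# Systematic sample: every 8th record (8, 16, 24, ...).
systematic = population[7::8]

# Simple random sample of the same size, seeded for repeatability.
rng = random.Random(0)
simple_random = rng.sample(population, len(systematic))

# A systematic sample has a single gap value; a random one has many.
print("distinct gaps, systematic:", set(gaps(systematic)))
print("number of distinct gaps, random:", len(set(gaps(simple_random))))
```

A constant gap between record numbers, as in the systematic case, is the pattern that suggests the sample was not drawn at random.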
2.5 Using the concept of overfitting, explain why when a model is fit to training data, zero error with those data is not necessarily good.
Zero error on the training data usually means the model has fit the noise in those data as well as the underlying relationship between the variables. Such a model reflects the quirks of one sample rather than the true pattern, so it is likely to perform worse on new data.
2.7 A dataset has 1000 records and 50 variables with 5% of the values missing, spread randomly throughout the records and variables. An analyst decides to remove records that have missing values. About how many records would you expect would be removed?
Almost all of them. If each value is missing independently with probability 0.05, a record is complete with probability 0.95^50 ≈ 0.077, so about 92% of the records (roughly 923 of 1000) would be removed, leaving very little usable data.
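The expected count for 2.7 can be checked with a short Python sketch. It assumes that each value is missing independently with probability 0.05, which is one reading of "spread randomly"; the function name `expected_removed` is hypothetical.

```python
def expected_removed(n_records, n_variables, p_missing):
    """Expected number of records with at least one missing value,
    assuming values are missing independently of one another."""
    # A record survives only if all of its values are present.
    p_complete = (1 - p_missing) ** n_variables
    return n_records * (1 - p_complete)

removed = expected_removed(n_records=1000, n_variables=50, p_missing=0.05)
print(round(removed))  # about 923 of the 1000 records
```

Under this assumption only about 77 records would remain, which is why dropping incomplete records is a poor choice when there are many variables.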