**QUESTION 1**

- The method that predicts the value of a given continuous-valued variable based on the values of other variables, assuming a linear or nonlinear model of dependency, is:
  - a. Cohesion
  - b. Separation
  - c. Regression
  - d. Correlation

**5 points**

**QUESTION 2**

Using the tables given, calculate the cost of each model. Which one is true? **(10 points)**

- a. M2 has higher COST
- b. They have the same COST
- c. M1 has higher COST
- d. We need more information

**5 points**

**QUESTION 3**

- Assume two attributes have a correlation of 0.02. What does this tell you about the relationship between the two attributes?
  - a. When one decreases, the other one increases
  - b. They have a strong correlation
  - c. They are highly correlated
  - d. They have a weak positive correlation

**5 points**

**QUESTION 4**

- The problem that arises when you fit your model based on your training data is called:
  - a. False Positive (FP)
  - b. False Negative (FN)
  - c. Overfitting
  - d. Underfitting

**5 points**

**QUESTION 5**

- The accuracy of a classification model is calculated using:
  - a. Entire Dataset
  - b. None of these
  - c. Testing Set
  - d. Training Set

**5 points**

**QUESTION 6**

- "Given two models of similar generalization errors, one should prefer the simpler model over the more complex model" is the definition of:
  - a. Occam’s Razor
  - b. Simple Model Theory
  - c. Basic Model Principle
  - d. Accuracy Models

**5 points**

**QUESTION 7**

- Which one is the most common measure used to evaluate K-means clusters?
  - a. Cohesion
  - b. Separation
  - c. Cluster Mean
  - d. SSE

**5 points**

**QUESTION 8**

- The method where you reserve 2/3 of the data for training and 1/3 for testing is:
  - a. Cross Validation
  - b. Stratified Training
  - c. Bootstrap
  - d. Holdout

**5 points**

**QUESTION 9**

- ___________________ measures how closely related the objects in a cluster are.
  - a. Cluster Mean
  - b. Cluster Separation
  - c. Cluster Centroid
  - d. Cluster Cohesion

**5 points**

**QUESTION 10**

- K-means is:
  - a. Centroid-based Hierarchical clustering
  - b. Medoid-based Partitional clustering approach
  - c. Medoid-based Hierarchical clustering
  - d. Centroid-based Partitional clustering approach

**5 points**

**QUESTION 11**

- For the tree given below:

What is the training error (optimistic error) for the parent:

- a. 12/36
- b. 5/36
- c. 10/36
- d. 24/36

**5 points**

**QUESTION 12**

- For the tree given below:

What is the training error (optimistic error) for the children:

- a. 5/36
- b. 6/36
- c. 24/36
- d. 12/36

**5 points**

**QUESTION 13**

- For the tree given below:

What is the training pessimistic error for the children (N = 0.5):

- a. 8/36
- b. 6/36
- c. 12/36
- d. 26/36

**10 points**
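
The three tree questions above rest on two error estimates. As a minimal sketch, assuming the common textbook definition in which the pessimistic error adds a fixed penalty per leaf node to the training-error count, and assuming N in the question denotes that per-leaf penalty, the calculation can be written as follows. The function names and the example counts are hypothetical and are not taken from the tree in the question.

```python
from fractions import Fraction


def training_error(leaf_errors, n_records):
    """Optimistic (training) error: misclassified records / total records."""
    return Fraction(sum(leaf_errors), n_records)


def pessimistic_error(leaf_errors, n_records, penalty=Fraction(1, 2)):
    """Pessimistic error: (errors + penalty * number_of_leaves) / total records.

    `penalty` is assumed to be the per-leaf penalty (N = 0.5 in the question).
    """
    penalty = Fraction(penalty)
    return (sum(leaf_errors) + penalty * len(leaf_errors)) / n_records


# Hypothetical tree: 4 leaves misclassifying 1, 2, 0 and 1 of 24 records.
example_leaves = [1, 2, 0, 1]
example_total = 24
optimistic = training_error(example_leaves, example_total)
pessimistic = pessimistic_error(example_leaves, example_total)
```

With exact fractions the results can be compared directly against answer options written as ratios.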

**QUESTION 14**

- K-means is:
  - a. Centroid-based Partitional clustering approach
  - b. Medoid-based Partitional clustering approach
  - c. Medoid-based Hierarchical clustering
  - d. Centroid-based Hierarchical clustering

**5 points**

**QUESTION 15**

- ___________________ measures how closely related the objects in a cluster are.

  ___________________ measures how distinct or well-separated a cluster is from other clusters.
  - a. Cluster Cohesion – Cluster Separation
  - b. Cluster Separation – Cluster Cohesion
  - c. Cluster Similarity – Cluster Distance
  - d. Cluster Distance – Cluster Similarity

**5 points**

**QUESTION 16**

- Consider the training examples shown in the table below for a binary classification problem.
**(10 points)**

Calculate the Gini value for the following attributes and answer the following questions:

Which attribute is better for the first split: Gender, Car Type, or Shirt Size?

- a. It does not matter. The algorithm chooses one randomly.
- b. Shirt Size
- c. Gender
- d. Car Type

**10 points**

**QUESTION 17**

- Consider the training examples shown in the table below for a binary classification problem.
**(10 points)**

Calculate the Gini value for the following attributes and answer the following questions: Gender, Car Type, Shirt Size

**What is the right order of the Gini values from lowest to highest?**

- a. Gender < Shirt Size < Car Type
- b. Gender < Car Type < Shirt Size
- c. Car Type < Gender < Shirt Size
- d. Shirt Size < Car Type < Gender
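
The Gini calculation asked for in the last two questions can be sketched as below. This assumes the usual definitions: the Gini index of a node is one minus the sum of squared class proportions, and the Gini value of a multiway split on an attribute is the size-weighted average over its child nodes. The function names and the four-record example are hypothetical, since the table referenced in the question is not reproduced here.

```python
from collections import Counter


def gini(labels):
    """Gini index of one node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(attribute_values, labels):
    """Weighted Gini of a multiway split on one attribute."""
    n = len(labels)
    groups = {}
    # Group the class labels by the attribute value of each record.
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    # Weight each child node's Gini by its share of the records.
    return sum(len(group) / n * gini(group) for group in groups.values())


# Hypothetical data (not the quiz table): one attribute, two classes.
example_attribute = ["a", "a", "b", "b"]
example_labels = ["C0", "C0", "C0", "C1"]
parent_gini = gini(example_labels)
split_gini = gini_split(example_attribute, example_labels)
```

Running `gini_split` once per attribute column and sorting the results gives the ordering; a lower weighted Gini indicates a purer split.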