
Learning with Regression and Trees


1. In practice, the line of best fit or regression line is found when _____________
a) Sum of residuals (∑(Y – h(X))) is minimum
b) Sum of the absolute value of residuals (∑|Y – h(X)|) is maximum
c) Sum of the square of residuals (∑(Y – h(X))²) is minimum
d) Sum of the square of residuals (∑(Y – h(X))²) is maximum
Answer: c
Explanation: Squaring the residuals penalizes large errors much more heavily than small ones, so there is a significant difference between making big errors and making small errors, which makes it easier to differentiate between candidate lines and select the best fit.
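The closed-form least-squares fit can be sketched in a few lines of pure Python; the function names here are illustrative, and the slope/intercept formulas below are the standard ones that minimise the sum of squared residuals.

```python
# Minimal sketch of an ordinary least squares line fit in pure Python.
# The closed-form slope and intercept below minimise the sum of
# squared residuals sum((y - (a + b*x))**2).

def fit_line(xs, ys):
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # slope b = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2)
    b = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) \
        / sum((x - x_mean) ** 2 for x in xs)
    a = y_mean - b * x_mean  # intercept
    return a, b

def sum_squared_residuals(xs, ys, a, b):
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

a, b = fit_line([1, 2, 3], [2, 4, 6])  # perfectly linear toy data
```

Perturbing the fitted line (for instance, shifting the intercept slightly) can only increase the sum of squared residuals, which is what "best fit" means here.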

2. If a linear regression model fits the training data perfectly, i.e., the train error is zero, then _____________________
a) Test error is also always zero
b) Test error is non zero
c) Couldn’t comment on Test error
d) Test error is equal to Train error
Answer: c
Explanation: The test error depends on the test data. If the test data is an exact representation of the training data, the test error is zero, but this may not be the case.

3. Which of the following metrics can be used for evaluating regression models?
i) R Squared ii) Adjusted R Squared iii) F Statistics iv) RMSE / MSE / MAE
a) ii and iv
b) i and ii
c) ii, iii and iv
d) i, ii, iii and iv
Answer: d
Explanation: These (R Squared, Adjusted R Squared, F Statistics, RMSE / MSE / MAE) are some metrics which you can use to evaluate your regression model.
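The four metrics from this question can be computed by hand; the sketch below uses toy data and illustrative names, and computes MSE, RMSE, MAE and R Squared from their textbook definitions.

```python
import math

# Hedged sketch: common regression metrics computed by hand on toy data.

def regression_metrics(y_true, y_pred):
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e ** 2 for e in errors) / n          # mean squared error
    rmse = math.sqrt(mse)                          # root mean squared error
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    y_mean = sum(y_true) / n
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)
    r2 = 1 - (sum(e ** 2 for e in errors) / ss_tot)  # R squared
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}

m = regression_metrics([1, 2, 3], [1, 2, 4])
```

On this toy data the only error is 1 on the last point, so MSE and MAE are both 1/3 and R Squared comes out to 0.5.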

4. How many coefficients do you need to estimate in a simple linear regression model (One independent variable)?
a) 1
b) 2
c) 3
d) 4
Answer: b
Explanation: In simple linear regression, there is one independent variable so 2 coefficients (Y=a+bx+error).

5. In a simple linear regression model (one independent variable), if we change the input variable by 1 unit, by how much will the output variable change?
a) by 1
b) no change
c) by intercept
d) by its slope
Answer: d
Explanation: For linear regression, Y = a + bx + error. If we neglect the error, Y = a + bx. If x increases by 1, then Y = a + b(x+1) = a + bx + b, so Y increases by its slope b.
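A tiny numerical illustration of the explanation above, with arbitrary illustrative values for the intercept and slope:

```python
# With Y = intercept + slope * x (error neglected), increasing x by one
# unit changes the prediction by exactly the slope.

intercept, slope = 3.0, 1.5  # illustrative values

def predict(x):
    return intercept + slope * x

delta = predict(11) - predict(10)  # equals the slope
```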

6. Function used for linear regression in R is __________
a) lm(formula, data)
b) lr(formula, data)
c) lrm(formula, data)
d) regression.linear(formula, data)
Answer: a
Explanation: lm(formula, data) refers to a linear model in which formula is an object of class “formula”, representing the relation between variables. This formula is then applied to the data to create the relationship model.

7. In syntax of linear model lm(formula,data,..), data refers to ______
a) Matrix
b) Vector
c) Array
d) List
Answer: b
Explanation: The formula is just a symbolic description of the relationship and is applied to the data, which is a vector. In general, a data.frame is used for the data.

8. In the mathematical Equation of Linear Regression Y = β1 + β2X + ϵ, (β1, β2) refers to __________
a) (X-intercept, Slope)
b) (Slope, X-Intercept)
c) (Y-Intercept, Slope)
d) (slope, Y-Intercept)
Answer: c
Explanation: The Y-intercept is β1 and the X-intercept is –(β1 / β2). An intercept is the point where the line crosses the corresponding axis.

9) You are given two variables V1 and V2 that follow the two characteristics below.
1. If V1 increases then V2 also increases
2. If V1 decreases then the behaviour of V2 is unknown
Looking at these two characteristics, which of the following options is correct for the Pearson correlation between V1 and V2?
A) Pearson correlation will be close to 1
B) Pearson correlation will be close to -1
C) Pearson correlation will be close to 0
D) None of these
answer: (D)

10) Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right to conclude that V1 and V2 do not have any relation between them?
A) TRUE
B) FALSE
answer: (B)

11) Which of the following offsets, do we use in linear regression’s least square line fit? Suppose horizontal axis is independent variable and vertical axis is dependent variable.

A) Vertical offset
B) Perpendicular offset
C) Both, depending on the situation
D) None of above
answer: (A)

12) True-False: Overfitting is more likely when you have a huge amount of data to train on?
A) TRUE
B) FALSE
answer: (B)

13) We can also compute the coefficients of linear regression with the help of an analytical method called the “Normal Equation”. Which of the following is/are true about the Normal Equation?
1. We don’t have to choose the learning rate
2. It becomes slow when the number of features is very large
3. There is no need to iterate
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
answer: (D)
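All three statements above can be seen in a direct implementation. The sketch below applies the Normal Equation β = (XᵀX)⁻¹Xᵀy to simple linear regression (intercept plus one feature) in pure Python; the function name is illustrative. There is no learning rate and no iteration, but the matrix inversion step is what becomes slow as the number of features grows.

```python
# Hedged sketch of the Normal Equation for simple linear regression.
# X has columns [1, x], so X^T X is 2x2 and can be inverted by hand.

def normal_equation(xs, ys):
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Solve (X^T X) beta = X^T y via the 2x2 inverse.
    det = n * sxx - sx * sx
    intercept = (sxx * sy - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope

b0, b1 = normal_equation([1, 2, 3], [2, 4, 6])  # toy data on the line y = 2x
```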

14) Below, two fitted regression lines (A & B) are shown on randomly generated data. We want to find the sum of residuals in both cases A and B.
Note:
The scale is the same in both graphs and for both axes.
The X-axis is the independent variable and the Y-axis is the dependent variable.
Which of the following statements is true about the sum of residuals of A and B?
A) A has higher sum of residuals than B
B) A has lower sum of residual than B
C) Both have same sum of residuals
D) None of these
answer: (C)

15) Choose the option which best describes the bias.
A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can’t say about bias
D) None of these
answer: (B)

16) What will happen when you apply a very large penalty in case of Ridge regression?
A) Some of the coefficients will become exactly zero
B) Some of the coefficients will approach zero but not become exactly zero
C) Both A and B depending on the situation
D) None of these
answer: (B)

17) What will happen when you apply a very large penalty in case of Lasso?
A) Some of the coefficients will become exactly zero
B) Some of the coefficients will approach zero but not become exactly zero
C) Both A and B depending on the situation
D) None of these
answer: (A)
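The contrast in questions 16 and 17 can be shown with the closed-form shrinkage rules that hold in a simplified setting (a single standardised feature, one common penalty parametrisation; both assumptions are mine, not the quiz's). Ridge shrinks a coefficient proportionally, so it never reaches zero; the lasso soft-thresholds it, so a large enough penalty makes it exactly zero.

```python
# Hedged illustration: ridge vs lasso shrinkage of a single coefficient.

def ridge_shrink(beta_ols, lam):
    # Proportional shrinkage: smaller but never exactly zero.
    return beta_ols / (1.0 + lam)

def lasso_shrink(beta_ols, lam):
    # Soft-thresholding: exactly zero once the penalty exceeds |beta|.
    if beta_ols > lam:
        return beta_ols - lam
    if beta_ols < -lam:
        return beta_ols + lam
    return 0.0

beta = 0.8            # illustrative OLS coefficient
big_penalty = 10.0
r = ridge_shrink(beta, big_penalty)   # small but non-zero
l = lasso_shrink(beta, big_penalty)   # exactly zero
```

This zeroing behaviour is why the lasso (question 34) is the one used for variable selection.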

18) Which of the following statement is true about outliers in Linear regression?
A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can’t say
D) None of these
answer: (A)

19) Suppose you plotted a scatter plot between the residuals and the predicted values in linear regression and found that there is a relationship between them. Which of the following conclusions do you make about this situation?
A) Since there is a relationship, our model is not good
B) Since there is a relationship, our model is good
C) Can’t say
D) None of these
answer: (A)

20) What will happen when you fit degree 4 polynomial in linear regression?
A) There are high chances that degree 4 polynomial will over fit the data
B) There are high chances that degree 4 polynomial will under fit the data
C) Can’t say
D) None of these
answer: (A)

21) What will happen when you fit degree 2 polynomial in linear regression?
A) There are high chances that degree 2 polynomial will over fit the data
B) There are high chances that degree 2 polynomial will under fit the data
C) Can’t say
D) None of these
answer: (B)

22) In terms of bias and variance. Which of the following is true when you fit degree 2 polynomial?
A) Bias will be high, variance will be high
B) Bias will be low, variance will be high
C) Bias will be high, variance will be low
D) Bias will be low, variance will be low
answer: (C)

23) Suppose l1, l2 and l3 are the three learning rates for A,B,C respectively. Which of the following is true about l1,l2 and l3?
A) l2 < l1 < l3
B) l1 > l2 > l3
C) l1 = l2 = l3
D) None of these
answer: (A)

24) Now we increase the training set size gradually. As the training set size increases, what do you expect will happen with the mean training error?
A) Increase
B) Decrease
C) Remain constant
D) Can’t Say
answer: (D)

25) What do you expect will happen with bias and variance as you increase the size of training data?
A) Bias increases and Variance increases
B) Bias decreases and Variance increases
C) Bias decreases and Variance decreases
D) Bias increases and Variance decreases
E) Can’t Say
answer: (D)

26) What would be the root mean square training error for this data if you run a Linear Regression model of the form (Y = A0+A1X)?
A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
answer: (C)

Question Context 27-28:
Suppose you have been given the following scenario for training and validation error for Linear Regression.

Scenario | Learning Rate | Number of iterations | Training Error | Validation Error
1        | 0.1           | 1000                 | 100            | 110
2        | 0.2           | 600                  | 90             | 105
3        | 0.3           | 400                  | 110            | 110
4        | 0.4           | 300                  | 120            | 130
5        | 0.4           | 250                  | 130            | 150

27) Which of the following scenarios would give you the right hyperparameters?
A) 1
B) 2
C) 3
D) 4
answer: (B)

28) Suppose you got the tuned hyperparameters from the previous question. Now imagine you add a variable to the variable space such that the added feature is important. Which of the following would you observe in such a case?
A) Training Error will decrease and Validation error will increase
B) Training Error will increase and Validation error will increase
C) Training Error will increase and Validation error will decrease
D) Training Error will decrease and Validation error will decrease
E) None of the above
answer: (D)

29) Suppose your model is under fitting. In such a situation, which of the following options would you consider?
1. Add more variables
2. Start introducing polynomial degree variables
3. Remove some variables
A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1, 2 and 3
answer: (A)

30) The situation is the same as in the previous question (under fitting). Which of the following regularization algorithms would you prefer?
A) L1
B) L2
C) Any
D) None of these
answer: (D)

31) Which of the following evaluation metrics cannot be applied to the output of logistic regression when comparing it with the target?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
answer: D

32) One of the very good methods to analyze the performance of Logistic Regression is AIC, which is similar to R-Squared in Linear Regression. Which of the following is true about AIC?
A) We prefer a model with minimum AIC value
B) We prefer a model with maximum AIC value
C) Both but depend on the situation
D) None of these
answer: A

33) [True-False] Standardisation of features is required before training a Logistic Regression.
A) TRUE
B) FALSE
answer: B

34) Which of the following algorithms do we use for Variable Selection?
A) LASSO
B) Ridge
C) Both
D) None of these
answer: A

Context: 35-36
Consider the following model for logistic regression: P(y = 1 | x, w) = g(w0 + w1x),
where g(z) is the logistic function.
In the above equation, P(y = 1 | x; w), viewed as a function of x, is what we can obtain by changing the parameters w.

35) What would be the range of p in such case?
A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)
answer: C

36) In the above question, which function makes p lie between (0, 1)?
A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them
answer: A

37) Suppose you have been given a fair coin and you want to find out the odds of getting heads. Which of the following option is true for such a case?
A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these
answer: C

38) The logit function(given as l(x)) is the log of odds function. What could be the range of logit function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)
answer: A

39) Which of the following option is true?
A) Linear Regression error values have to be normally distributed, but in the case of Logistic Regression this is not required
B) Logistic Regression error values have to be normally distributed, but in the case of Linear Regression this is not required
C) Both Linear Regression and Logistic Regression error values have to be normally distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally distributed
answer: A

40) Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number “x”
Logit(x): is a logit function of any number “x”
Logit_inv(x): is an inverse logit function of any number “x”
A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these
answer: B
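Questions 35-40 all hinge on the same pair of functions, which a short sketch can tie together: the logistic function maps any real x into (0, 1); the logit (log of the odds) maps (0, 1) back onto (-∞, ∞); and each is the inverse of the other. The fair-coin odds from question 37 fall out directly.

```python
import math

# Hedged sketch of the logistic and logit functions and their inverse
# relationship.

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))  # log of the odds p / (1 - p)

# Fair coin: p = 0.5, so odds = 0.5 / 0.5 = 1 and logit(0.5) = log(1) = 0.
odds = 0.5 / (1 - 0.5)

# logistic is the inverse of logit (Logistic(x) = Logit_inv(x)):
round_trip = logistic(logit(0.3))
```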

41. A _________ is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility.
a) Decision tree
b) Graphs
c) Trees
d) Neural Networks
Answer: a
Explanation: Refer to the definition of a decision tree.

42. Decision Tree is a display of an algorithm.
a) True
b) False
Answer: a
Explanation: None.

43. What is Decision Tree?
a) Flow-Chart
b) Structure in which internal node represents test on an attribute, each branch represents outcome of test and each leaf node represents class label
c) Flow-Chart & Structure in which internal node represents test on an attribute, each branch represents outcome of test and each leaf node represents class label
d) None of the mentioned
Answer: c
Explanation: Refer to the definition of a decision tree.
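The structure described in option b can be sketched as nested conditionals; the attributes and class labels below are made up for illustration and are not part of the quiz.

```python
# Hedged illustration of a decision tree: each internal node tests an
# attribute, each branch is a test outcome, each leaf is a class label.

def classify(sample):
    # internal node: test on attribute "outlook"
    if sample["outlook"] == "sunny":
        # internal node: test on attribute "humidity"
        if sample["humidity"] > 70:
            return "dont_play"  # leaf: class label
        return "play"           # leaf: class label
    return "play"               # leaf: class label

result = classify({"outlook": "sunny", "humidity": 85})
```

Read top to bottom, this is also the flow-chart view from option a: the tree is simply a display of the if/else algorithm (question 42).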

44. Decision Trees can be used for Classification Tasks.
a) True
b) False
Answer: a
Explanation: None.

45. Choose from the following that are Decision Tree nodes?
a) Decision Nodes
b) End Nodes
c) Chance Nodes
d) All of the mentioned
Answer: d
Explanation: None.

46. Decision Nodes are represented by ____________
a) Disks
b) Squares
c) Circles
d) Triangles
Answer: b
Explanation: None.

47. Chance Nodes are represented by __________
a) Disks
b) Squares
c) Circles
d) Triangles
Answer: c
Explanation: None.

48. End Nodes are represented by __________
a) Disks
b) Squares
c) Circles
d) Triangles
Answer: d
Explanation: None.

49. Which of the following are the advantage/s of Decision Trees?
a) Possible Scenarios can be added
b) Use a white box model; if a given result is provided by a model, the explanation for the result is easily replicated
c) Worst, best and expected values can be determined for different scenarios
d) All of the mentioned
Answer: d
Explanation: None.
