DSA821S - DATA SCIENCE AND ANALYTICS - 1ST OPP - NOV 2022


NAMIBIA UNIVERSITY
OF SCIENCE AND TECHNOLOGY
FACULTY OF COMPUTING AND INFORMATICS
DEPARTMENT OF INFORMATICS
QUALIFICATION: Bachelor of Informatics Honours (with specialisations in Web Informatics and
Business Informatics)
QUALIFICATION CODE: 0SBIFH/0SBIHB
COURSE LEVEL: NQF LEVEL 8
COURSE: Data Science and Analytics
COURSE CODE: DSA821S
DATE: NOVEMBER 2022
SESSION: 1
DURATION: 2 Hours
MARKS: 60
FIRST OPPORTUNITY EXAMINATION QUESTION PAPER
EXAMINER(S): DR LAMECK MBANGULA AMUGONGO
MODERATOR(S): MS EMILIA SHIKEENGA
THIS EXAMINATION PAPER CONSISTS OF 5 PAGES
(INCLUDING THIS FRONT PAGE)
Instructions for the students
1. Answer ALL the questions.
2. Write clearly and neatly.
3. Number the answers clearly.

Question 1: Short questions
[18]
a) True or False: When performing unsupervised learning we know the number of clusters
beforehand.
[1]
b) True or False: Big data is initially characterized by 3 Vs: Volume, Veracity and Variety? [1]
c) True or False: Data Science is the same as Data analytics.
[1]
d) True or False: Business can utilise insights from data to maintain competitive advantage. [1]
e) True or False: The below Figure is an example of Inferential statistics?
[1]
[Figure: Daily new COVID-19 cases in Namibia from 1 Jul - 23 Dec 2021. Series shown: Daily new cases and 7day_rolling_avg; vertical axis from 0 to 1750.]
f) True or False: If two variables X and Y are correlated, then we must be able to specify the
cause, i.e., X is the cause or Y is the cause.
[1]
g) True or False: In a classification problem, if training followed by testing gives an
accuracy of 99.7%, we can necessarily conclude that it is a good model.
[1]
h) Which of the following are correct about Activation Functions in neural network?
[2]
a. Derivative of a sigmoid activation function g(z) is g(z)[1 - g(z)]
b. Derivative of a hyperbolic tangent activation function k(z) is 1 - (k(z))^2
c. Derivative of a leaky ReLU activation function h(z) is 1
d. Derivative of ReLU activation function l(z) is 0 for z < 0
i) Choose the correct option for residuals in Linear regression?
[1]
a. Residuals are horizontal offset, and the sum of residuals varies between [0,1]
b. Residuals are horizontal offset, and the sum of residuals can be unity.
c. Residuals are vertical offset, and the sum of residuals is always unity.
d. Residuals are vertical offset, and the sum of residuals is always zero.
j) Which of the following are correct related to the Confusion Matrix?
[2]
a. Confusion matrix is always a square matrix
b. Confusion matrix is a way to judge our classification model
c. Diagonal entries in a confusion matrix may be zero or non-zero
d. Confusion matrix is a symmetric matrix
k) Which of the following statements are correct for Support Vector Machines (SVM)? [1]
a. A support vector machine is a machine learning algorithm that analyses data for
both classification and regression analysis.
b. SVM is an unsupervised learning method.
c. An SVM finds the hyperplane which is having the largest margin value.
d. SVMs are used in text categorization, image classification recognition, etc.
l) Which is not a deep learning method:
[1]
a. Learning rate Decay.
b. Dropout.
c. Training from scratch.
d. Bootstrapping.
e. Transfer Learning.
m) If you have a date column in your dataset, how would you perform feature engineering
using Python? Hint: a date column holds many useful features, such as day of the week,
day of the month, day of the quarter, and day of the year.
[4]
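For illustration only, one possible approach to the date question above is sketched below using the Python standard library. The sample dates and the helper name date_features are hypothetical and not part of the paper; with pandas, the Series.dt accessor offers comparable attributes such as dayofweek, day and dayofyear.

```python
from datetime import date


def date_features(d: date) -> dict:
    """Derive calendar features from a single date (hypothetical helper)."""
    quarter = (d.month - 1) // 3 + 1
    # First day of the quarter that contains d
    quarter_start = date(d.year, 3 * (quarter - 1) + 1, 1)
    return {
        "year": d.year,
        "month": d.month,
        "quarter": quarter,
        "day_of_week": d.weekday(),  # Monday = 0, Sunday = 6
        "day_of_month": d.day,
        "day_of_quarter": (d - quarter_start).days + 1,
        "day_of_year": d.timetuple().tm_yday,
        "is_weekend": d.weekday() >= 5,
    }


# Hypothetical date column stored as ISO-formatted strings
date_column = ["2021-07-01", "2021-12-23", "2022-11-15"]
features = [date_features(date.fromisoformat(s)) for s in date_column]

for row in features:
    print(row)
```

Each derived feature becomes a new numeric or boolean column that a model can use in place of the raw date.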
Question 2: Apriori algorithm
[16]
A table has five transactions. Let the minimum support (min_sup) = 60% and the minimum
confidence (conf) = 80%.
ItemID   Items_bought
F100     {Bread, Egg, Milk, Butter, Honey, Sugar}
F101     {Cereal, Egg, Milk, Butter, Honey, Sugar}
F102     {Bread, Bacon, Butter, Honey}
F103     {Bread, Jam, Cookie, Butter, Sugar}
F104     {Cookie, Egg, Egg, Butter, Cucumber, Honey}
a) Find all frequent item sets using Apriori algorithm.
[10]
b) List all the strong association rules (with support and confidence).
[6]
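The level-wise search that Apriori performs can be sketched in plain Python as follows. This is a minimal illustration, not a model answer: it assumes the five transactions above are treated as sets (so a repeated item counts once), and it uses integer percentages for the thresholds to avoid floating-point comparisons. All variable names are illustrative.

```python
from itertools import combinations

# Transactions from the table, stored as sets (duplicates collapse)
transactions = [
    {"Bread", "Egg", "Milk", "Butter", "Honey", "Sugar"},
    {"Cereal", "Egg", "Milk", "Butter", "Honey", "Sugar"},
    {"Bread", "Bacon", "Butter", "Honey"},
    {"Bread", "Jam", "Cookie", "Butter", "Sugar"},
    {"Cookie", "Egg", "Egg", "Butter", "Cucumber", "Honey"},
]
MIN_SUP_PCT = 60   # minimum support, in percent
MIN_CONF_PCT = 80  # minimum confidence, in percent
n = len(transactions)


def count(itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(itemset <= t for t in transactions)


def is_frequent(itemset):
    return count(itemset) * 100 >= MIN_SUP_PCT * n


# Level 1: frequent single items
items = sorted(set().union(*transactions))
level = [frozenset([i]) for i in items if is_frequent(frozenset([i]))]

frequent = {}  # itemset -> support count
k = 1
while level:
    for s in level:
        frequent[s] = count(s)
    k += 1
    # Join step: merge frequent (k-1)-itemsets into k-item candidates
    candidates = {a | b for a in level for b in level if len(a | b) == k}
    # Prune step: every (k-1)-subset of a candidate must be frequent
    candidates = {
        c for c in candidates
        if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
    }
    level = [c for c in candidates if is_frequent(c)]

# Strong rules: antecedent -> consequent with confidence >= MIN_CONF_PCT
rules = []
for itemset, sup in frequent.items():
    for r in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), r):
            a = frozenset(antecedent)
            if sup * 100 >= MIN_CONF_PCT * frequent[a]:
                rules.append((a, itemset - a, sup / n, sup / frequent[a]))

for a, c, support, confidence in rules:
    print(sorted(a), "->", sorted(c),
          f"support={support:.0%} confidence={confidence:.0%}")
```

Confidence is computed as the support count of the whole itemset divided by the support count of the antecedent, which is why the frequent itemsets are stored with their counts.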
Question 3: Classification
[16]
1. The table below shows the predictions of a model that predicts Bankruptcy. Based on the
test set, calculate the evaluation measures.
No   Target          Prediction
1    Bankruptcy      Bankruptcy
2    Bankruptcy      Bankruptcy
3    Bankruptcy      Bankruptcy
4    Bankruptcy      Bankruptcy
5    No Bankruptcy   No Bankruptcy
6    Bankruptcy      Bankruptcy
7    No Bankruptcy   No Bankruptcy
8    No Bankruptcy   No Bankruptcy
9    Bankruptcy      Bankruptcy
10   Bankruptcy      Bankruptcy
11   Bankruptcy      Bankruptcy
12   No Bankruptcy   No Bankruptcy
13   Bankruptcy      Bankruptcy
14   No Bankruptcy   No Bankruptcy
15   Bankruptcy      Bankruptcy
16   No Bankruptcy   No Bankruptcy
17   Bankruptcy      No Bankruptcy
18   No Bankruptcy   Bankruptcy
19   No Bankruptcy   No Bankruptcy
20   No Bankruptcy   No Bankruptcy
21   No Bankruptcy   No Bankruptcy
a) Complete the confusion matrix. [4]
b) Compute the misclassification rate. [2]
c) Compute the F1-measure. [4]
2. Consider the following 3-class confusion matrix:
              Predicted
Actual      A     B     C
A          25     5     2
B           3    32     4
C           1     0    15
a) What is the overall accuracy? [2]
b) What can you say about Recall and Sensitivity? [2]
c) What is the precision for class A? [1]
d) What is the specificity of class C? [1]
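The per-class measures asked about in the 3-class question can be computed in a one-vs-rest fashion, as in the sketch below. It assumes the matrix is read with actual classes as rows (A: 25, 5, 2; B: 3, 32, 4; C: 1, 0, 15) and predicted classes as columns; the function and variable names are illustrative only.

```python
labels = ["A", "B", "C"]

# Assumed orientation: rows = actual class, columns = predicted class
cm = [
    [25, 5, 2],   # actual A
    [3, 32, 4],   # actual B
    [1, 0, 15],   # actual C
]

total = sum(sum(row) for row in cm)
# Overall accuracy: correctly classified (diagonal) over all instances
accuracy = sum(cm[i][i] for i in range(len(labels))) / total


def per_class(i):
    """One-vs-rest measures for the class at index i."""
    tp = cm[i][i]
    fn = sum(cm[i]) - tp                  # rest of the actual-class row
    fp = sum(row[i] for row in cm) - tp   # rest of the predicted-class column
    tn = total - tp - fn - fp
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),         # recall is also called sensitivity
        "specificity": tn / (tn + fp),
    }


print(f"accuracy = {accuracy:.4f}")
for i, name in enumerate(labels):
    measures = per_class(i)
    print(name, {k: round(v, 4) for k, v in measures.items()})
```

The same tp, fn, fp and tn counts also give the misclassification rate (1 minus accuracy) and the F1-measure (the harmonic mean of precision and recall) for a two-class problem.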
Question 4: Linear Optimisation
[10]
During the festive season, Pick n Pay Oshakati combines two products, rice and potato, to form a
gift pack which must weigh 5 kg. At least 2 kg of rice and not more than 4 kg of potato should be used.
The net profit contribution to Pick n Pay is N$ 5 per kg for rice and N$ 6 per kg for
potato. Formulate an LP model to find the optimal factor mix.
a) Formulate the objective function.
[3]
b) Formulate Constraints.
[3]
c) Non-negative constraints.
[1]
d) Summarise the optimization problem.
[3]
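As a rough check on one possible formulation, the sketch below assumes x1 is the kg of rice and x2 the kg of potato, with the model: maximise Z = 5*x1 + 6*x2 subject to x1 + x2 = 5, x1 >= 2, x2 <= 4 and x1, x2 >= 0. Because the weight constraint is an equality, x2 can be substituted by 5 - x1, and a linear objective over the remaining interval attains its maximum at an endpoint. The symbols and names are assumptions made for illustration.

```python
# Assumed decision variables: x1 = kg of rice, x2 = kg of potato
PROFIT_RICE = 5     # N$ per kg
PROFIT_POTATO = 6   # N$ per kg
PACK_WEIGHT = 5     # kg, x1 + x2 must equal this
MIN_RICE = 2        # kg, x1 >= 2
MAX_POTATO = 4      # kg, x2 <= 4


def profit(x1, x2):
    """Objective function Z = 5*x1 + 6*x2."""
    return PROFIT_RICE * x1 + PROFIT_POTATO * x2


# Substitute x2 = PACK_WEIGHT - x1 and collect the bounds on x1:
#   x1 >= MIN_RICE
#   x2 <= MAX_POTATO  ->  x1 >= PACK_WEIGHT - MAX_POTATO
#   x2 >= 0           ->  x1 <= PACK_WEIGHT
lower = max(MIN_RICE, PACK_WEIGHT - MAX_POTATO, 0)
upper = PACK_WEIGHT

# A linear objective on an interval is maximised at one of its endpoints
candidates = [(x1, PACK_WEIGHT - x1) for x1 in (lower, upper)]
best = max(candidates, key=lambda point: profit(*point))

print("rice (kg):", best[0])
print("potato (kg):", best[1])
print("profit (N$):", profit(*best))
```

For problems with more variables or inequality-only constraints, a general LP solver would be used instead of this endpoint check.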
END OF EXAM