DTA621S - DATA ANALYTICS - 2ND OPPSUPL - JAN 2023


NAMIBIA UNIVERSITY
OF SCIENCE AND TECHNOLOGY
FACULTY OF COMPUTING AND INFORMATICS
DEPARTMENT OF INFORMATICS
QUALIFICATION: BACHELOR OF INFORMATICS, BACHELOR OF COMPUTER SCIENCE
QUALIFICATION CODE: 07BAIT, 07BCMS
LEVEL: 6
COURSE: DATA ANALYTICS
COURSE CODE: DTA621S
DATE: JANUARY 2023
SESSION: 1
DURATION: 3 HOURS
MARKS: 100
SECOND OPPORTUNITY/SUPPLEMENTARY EXAMINATION QUESTION PAPER
EXAMINER(S): MRS RUUSA IPINGE
MODERATOR: Dr. JACOB ONGALA
THIS QUESTION PAPER CONSISTS OF 8 PAGES
(Including this front page)
INSTRUCTIONS
• Answer ALL questions in Question 1, Question 2, Question 3, Question 4 and Question 5
• NUST examinations rules apply
• DO NOT open this examination cover until you are instructed to do so.
• DO NOT FORGET to write down your student number at the designated places in the
examination page.

Question 1: MULTIPLE CHOICE QUESTIONS (20 MARKS; MAXIMUM 1 MARK FOR EACH CORRECT
ANSWER)
Answer all questions. Select ONLY ONE BEST ANSWER to each question.
1. The 'right to be forgotten' gives an individual the right to have their personal data
what?
a) Read by everyone
b) Amended
c) Erased
d) Restricted
2. Point out the correct statement
a) Raw data is the data in excel sheet
b) Raw data is the data obtained after pre-processing
c) Raw data is the data before pre-processing
d) None of the above
3. __ are used when we want to visually examine the relationship between two
quantitative variables.
a) Bar graph
b) Scatterplot
c) Line graph
d) Pie chart
4. A graph that uses vertical bars to represent data is called a __ .
a) Bar graph
b) Line graph
c) Scatterplot
d) All of the mentioned above
5. Data analysis is a process of:
a) Inspecting data
b) Data Cleaning
c) Transforming of data
d) All of the mentioned above
6. What are the only grounds for individuals to have an absolute right to object?
a) Legitimate interest
b) Public Interest
c) Direct Marketing
d) None of the above
7. Students divided into different groups according to their intelligence and gender
will generate which type of data?
a) Quantitative
b) Qualitative
c) Continuous data
d) Constant
8. The mode of 2, 9 and 7 is:
a) No mode exists.
b) 2
c) 9
d) 6
9. The data must be arranged in order before computing the:
a) Mean
b) Mode
c) Median
d) All of the above
10. To predict a binary value, use __ .
a) Logistic
b) Clustering
c) Classification
d) Dimensionality reduction
11. How do machine learning algorithms make more precise predictions?
a) The algorithms are typically run on more powerful servers.
b) The algorithms are better at seeing patterns in the data.
c) Machine learning servers can host larger databases.
d) The algorithms can run on unstructured data.
12. Machine learning is a subset of?
a) Artificial Intelligence
b) Deep Learning
c) Artificial intelliget
d) None of the above
13. In linear regression, we try to __ the least square errors of the model to identify
the line of best fit.
a) Minimise
b) Maximise
c) Change
d) None of the above
14. What is the data type reported by print(type("Loide"))?
a) Float
b) String
c) Integer
d) Boolean
15. Under the GDPR, how many main rights do individuals have?
a) 6
b) 8
c) 9
d) 7
16. Within how many hours must a data breach be reported after it is first identified?
a) 24
b) 48
c) 12
d) 72
17. This is the process of programmatically identifying and grouping objects or ideas
into pre-determined categories
a) Classification
b) Binary
c) Sorting
d) Variance
18. You want to identify global weather patterns that may have been affected by
climate change. To do so, you want to use machine learning algorithms to find
patterns that would otherwise be imperceptible to a human meteorologist. What is
the best place to start?
a) Find labeled data of sunny days so that the machine will learn to identify bad
weather.
b) Use unsupervised learning to have the machine look for anomalies in a massive
weather database.
c) Create a training set of unusual patterns and ask the machine learning
algorithms to classify them.
d) Create a training set of normal weather and have the machine look for similar
patterns.
19. Why is naive Bayes called naive?
a) It naively assumes that you will have no data.
b) It does not even try to create accurate predictions.
c) It naively assumes that the predictors are independent from one another.
d) It naively assumes that all the predictors depend on one another.
20. Which statistical model and supervised machine learning algorithm uses
independent variables to predict the values of a dependent variable?
a) Lasso regression.
b) Multiple Regression.
c) Logistic Regression
d) Linear regression.
Question 2
[8 Marks]
Look at the following boxplots about Emma's performance in physics and chemistry and answer
the following questions.
[Figure: box plots of marks in Physics and Chemistry on a scale from 0 to 100, marked in steps of 10]
(a) State the median mark for each subject. (2)
(b) Find the range of marks in each subject. (2)
(c) Calculate the interquartile range for Chemistry. (3)
(d) Give one benefit of using a scatter plot. (1)
Question 3
[32 Marks]
a) List and describe 4 different types of data distribution. (8)
b) List and explain the methods of data integration. (8)
c) Explain three methods you can use to clean noisy data. (6)
d) Using examples, explain the process of data science. (10)
Question 4
[20 Marks]
a) Explain underfitting. How can you avoid it? (6)
b) Explain the difference between the data controller and the data processor in GDPR. (4)
c) Describe 4 methods you could use to evaluate the performance of a machine
learning model. (4)
d) Briefly explain what a neural network is. (2)
e) Explain the support vector machine (SVM) learning algorithm. (2)
Question 5
[20 Marks]
5.1 Based on the Jupyter notebook, explain what each of the following commands means
a) df.dtypes (2)
b) df.head() (2)
c) df.drop_duplicates() (2)
d) df.nsmallest(n, 'values') (2)
e) df.fillna('values') (2)
5.2 Write a command in Jupyter notebook that will allow you to perform the following tasks
a) Give the summary statistics of each column (2)
b) df.iat[1, 2] (2)
c) Give the minimum value of each object (2)
d) Group the object by columns (2)
e) Calculate the standard deviation of each object (2)
THE END OF EXAM