Purpose of project
Over the years at Queen’s Royal College I have seen teachers having stern conversations with students who habitually arrive at school late. These students face consequences such as “in-house suspension” or community service for regular late coming. I myself have received these punishments. It is believed that students who are frequently late are undisciplined, and that this can spill over into their study habits and so affect their overall performance in their internal examinations. On the other hand, some hold the view that punctuality has no effect on a student’s performance.
The reasoning is that students do extra studies at home and so make up for time lost at school. In that context I would like to determine, through a statistical study, whether or not there is a correlation between students’ punctuality and academic performance at Queen’s Royal College. I chose to study the present fifth form year group because this is the year in which they will sit the CXC CSEC examinations, so I assume that at this point their attitude towards their school work will be serious.
Let X be the number of times a form 5 student was late during the term
Let Y be a form 5 student’s average end of term examination score
seldom number of times late
excessive number of times late
30% – 49% bad average score
50% – 69% good average score
70% – 89% excellent average score
n is the number of students in the sample
Let ∑x represent the sum of the number of times late of the form 5 students in the sample
Let ∑y represent the sum of the form 5 students’ end of term exam average scores
Let ∑xy represent the sum of the products of each form 5 student’s number of times late and end of term exam average score
Let ∑x² represent the sum of the squares of the form 5 students’ number of times late
Let ∑y² represent the sum of the squares of the form 5 students’ end of term exam average scores
Let x̄ represent the sample mean of X
Let ȳ represent the sample mean of Y
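For reference, the sums defined above are the ingredients of the usual textbook formulas for the sample means, the product moment correlation coefficient r and the least squares regression line y = a + bx. The forms below are the standard ones, given only as a reminder; the write-up does not show which equivalent form was actually used in the working.

```latex
\bar{x} = \frac{\sum x}{n}, \qquad \bar{y} = \frac{\sum y}{n}

r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}
         {\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)
                \left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}}

b = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}
         {\sum x^2 - \dfrac{(\sum x)^2}{n}},
\qquad a = \bar{y} - b\,\bar{x}
```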
Method of data collection
1. I collected a copy of the roll books for the various form 5 classes for term 1 (September to December) from the various form teachers.
2. The average score for each student was obtained from the dean of the form 5 year.
3. I counted the number of times late for each student and totaled it.
4. Then I sampled the data. To do this, I used simple random sampling within each class by the lottery method. I wrote each of the student numbers for a particular class (R first) on a separate little piece of paper and put them all into a hat. Then I picked out 10, one at a time without replacement, and for each one I chose, I wrote down the number of times late and the corresponding average score.
5. I repeated this for the classes O, Y and L, so in the end I had a sample size of 40, 10 from each class.
6. Afterwards I organized the data, listing each student number with the corresponding number of times late and average end of term exam score for form 5 classes R, O, Y and L, and put it into a table.
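The lottery draw described in the steps above can be imitated in a few lines of Python. This is only a sketch under stated assumptions: the class sizes (30 students each) and the student numbering are hypothetical, since the roll books are not reproduced here, and `lottery_sample` is a name made up for illustration.

```python
import random

def lottery_sample(class_rolls, per_class=10, seed=None):
    """Draw `per_class` student numbers from each class without replacement."""
    rng = random.Random(seed)
    sample = {}
    for class_name, student_numbers in class_rolls.items():
        # rng.sample draws without replacement, like picking slips from a hat
        sample[class_name] = rng.sample(student_numbers, per_class)
    return sample

# Hypothetical rolls: 30 numbered students per class (real class sizes are not given)
class_rolls = {name: list(range(1, 31)) for name in ["R", "O", "Y", "L"]}

chosen = lottery_sample(class_rolls, per_class=10, seed=1)
total_sampled = sum(len(numbers) for numbers in chosen.values())
print(total_sampled)  # 40
```

Fixing the seed makes the draw repeatable, which a paper lottery is not; leaving `seed=None` gives a fresh draw each run.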
Presentation of data
Fig 1.1 is a table showing the sample of n = 40 form 5 students and their corresponding punctuality and average score obtained at the end of the term. Of the forty students chosen, twenty-five were seldom late and fifteen were excessively late. It also shows that eight performed badly in the end of term exam, twenty-one performed well and eleven performed excellently. Of the eight that performed badly, 2 were excessively late and 6 were seldom late. Of the twenty-one that performed well, 12 were excessively late and 9 were seldom late. Of the eleven that performed excellently, 1 was excessively late and 10 were seldom late.
Fig 1.2 is a bar graph showing the performance level of students who were seldom late and of those who were excessively late. Of the eight that performed badly (see Fig 1.1), 75% were seldom late and 25% were excessively late. Of the twenty-one that performed well, 43% were seldom late and 57% were excessively late. Of the eleven that performed excellently, 91% were seldom late and 9% were excessively late.
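The percentages quoted for Fig 1.2 follow directly from the counts given for Fig 1.1. A short sketch of that arithmetic, using only those counts (the helper name `late_percentages` is chosen here for illustration):

```python
# Counts reported for Fig 1.1: performance band -> (seldom late, excessively late)
counts = {
    "bad": (6, 2),
    "good": (9, 12),
    "excellent": (10, 1),
}

def late_percentages(counts):
    """Return, per band, the rounded percentages (seldom late, excessively late)."""
    result = {}
    for band, (seldom, excessive) in counts.items():
        total = seldom + excessive
        result[band] = (round(100 * seldom / total), round(100 * excessive / total))
    return result

percentages = late_percentages(counts)
for band, (seldom_pct, excessive_pct) in percentages.items():
    print(f"{band}: {seldom_pct}% seldom late, {excessive_pct}% excessively late")
```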
Fig 1.3 is a scatter plot showing form 5 students’ average end of term score in relation to the number of times they were late.
Analysis of data
Chi-square test of independence
A χ²-test of independence at the 5% level of significance will be used to determine whether the form 5 students’ number of times late and average end of term scores are independent of each other, or if there is a relationship between them.
H0 represents the null hypothesis
H1 represents the alternative hypothesis
O represents observed frequencies
E represents expected frequencies
α represents the level of significance
ν represents the number of degrees of freedom
H0: A student’s form 5 end of term average score is independent of his number of times late.
H1: A student’s form 5 end of term average score is dependent on his number of times late.
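As an illustration of how such a test is carried out, the sketch below applies the standard statistic, the sum of (O − E)²/E over all cells, to the counts reported for Fig 1.1. It assumes a 2 × 3 table (seldom or excessively late against bad, good or excellent); the original working is not shown in this write-up, so the figures printed here are only what those counts imply, not a quotation of the author’s calculation.

```python
# Observed frequencies O from the counts reported for Fig 1.1.
# Rows: seldom late, excessively late. Columns: bad, good, excellent.
observed = [
    [6, 9, 10],
    [2, 12, 1],
]

def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom, expected frequencies)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # E = (row total x column total) / grand total for each cell
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof, expected

statistic, dof, expected = chi_square_independence(observed)

# Standard table value of chi-square for 2 degrees of freedom at alpha = 0.05
CRITICAL_5_PERCENT_DF2 = 5.991
reject_h0 = statistic > CRITICAL_5_PERCENT_DF2

print(round(statistic, 2), dof, reject_h0)
```

With these counts the statistic exceeds the critical value, so H0 would be rejected at the 5% level under the assumed table layout.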
In Fig 1.4, a regression line was drawn through the points; it passes through the point of means of the two sets of data, (x̄, ȳ). The line shows that y tends to decrease extremely gently as x increases. Also, the points are scattered about the regression line. This shows that there is a very weak negative correlation between X and Y.
Discussion of findings
My purpose was to investigate the relationship between students’ punctuality (X) and academic performance (Y) in a form 5 year group at Queen’s Royal College. After I collected my data and sampled it, I put it into a table (Fig 1.1), and then presented it in a scatter plot (Fig 1.3) and a bar graph (Fig 1.2). This made the relationship between X and Y easier to identify and the two variables easier to compare. After representing my data, I chose to do a Chi-square test of independence to determine whether or not X and Y are independent of each other. I had expected, at the 5% significance level, to reject the alternative hypothesis, which would have meant that X and Y are independent of each other, so that a student’s form 5 end of term average exam score does not depend on his punctuality record. However, that was not the case: the Chi-square test indicated that X and Y are dependent on each other.
After the Chi-square test indicated that X and Y are dependent on each other, another test was carried out. Details of the relationship were necessary, and so r, the linear product moment correlation coefficient, and the equation of the regression line were calculated. The linear product moment correlation coefficient ranges from −1 to 1 and indicates the strength and direction of the linear correlation between two variables. In this study, r was found to be −0.141. This value is negative and very close to 0, indicating that there is a very weak negative linear correlation between X and Y. Therefore, from this test, there is little evidence of a linear relationship between X and Y. r also indicates how well the least squares regression line that was found fits the data.
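The calculation of r and of the regression coefficients from paired data can be sketched as follows. This is a minimal illustration using the standard sum-based formulas, not the author’s actual working; the function name is hypothetical, and the four points used to exercise it are sanity-check values lying exactly on a line, not data from this study.

```python
import math

def correlation_and_regression(xs, ys):
    """Return (r, a, b) for paired data, where the fitted line is y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    # Corrected sums of products and squares
    s_xy = sum_xy - sum_x * sum_y / n
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n

    r = s_xy / math.sqrt(s_xx * s_yy)   # product moment correlation coefficient
    b = s_xy / s_xx                     # gradient of the regression line of y on x
    a = sum_y / n - b * sum_x / n       # intercept, so the line passes through the means
    return r, a, b

# Sanity-check values only (NOT data from this study): points on y = 10 - 2x
r, a, b = correlation_and_regression([0, 1, 2, 3], [10, 8, 6, 4])
print(r, a, b)
```

Points lying exactly on a falling line give r = −1; a value such as −0.141 sits far from that extreme and close to 0.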
A least squares regression line of Y on X minimizes the sum of the squares of the vertical (y) differences between the points and the line, so it is the line of best fit for the data in the scatter plot. The equation of this line was found to be y = 62.12 − 0.2x, and the point (x̄, ȳ) lies on this line; this was demonstrated on the second scatter plot (Fig 1.4). Since r is very low, the line fits the data poorly, and therefore predictions made from it will be unreliable. The value of b, −0.2, represents the amount by which y changes for every unit increase in x, i.e. a student is predicted to lose 0.2 marks in the form 5 end of term average for each additional time late. The value of a, 62.12, represents the predicted average score in the form 5 end of term exams for a student who is late 0 times for the term.
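To make the interpretation of a and b concrete, the sketch below evaluates the reported line at a few values of x and squares the reported r. The constants are the values stated in this study; the function name and the choice of 0, 5 and 10 times late are for illustration only.

```python
A = 62.12   # intercept a reported in the study
B = -0.2    # gradient b reported in the study
R = -0.141  # correlation coefficient r reported in the study

def predicted_score(times_late):
    """Predicted end of term average (%) from the reported regression line."""
    return A + B * times_late

for times_late in (0, 5, 10):
    print(times_late, round(predicted_score(times_late), 2))

# r squared: the fraction of the variation in y accounted for by the line
r_squared = round(R ** 2, 4)
print(r_squared)
```

Ten extra late arrivals move the predicted average by only 2 marks, and r² is about 0.02, which is consistent with the remark that predictions from this line will be unreliable.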
This sample was taken from only one year group, and so it does not necessarily represent future year groups accurately. The tests were done using scores from only one examination, which may introduce errors: students may not have performed at their usual level for various reasons, such as an illness or a family problem. Students’ subject choices also vary, in that some may be doing relatively easier subjects than others and some may be doing fewer subjects than others. While collecting my data I observed that many students were frequently absent. Therefore, besides punctuality, absenteeism could have affected their end of term average scores.
In this study, one test indicated that X and Y are dependent on each other, while the other found only a very weak linear correlation between them. Therefore no clear-cut conclusion can be made as to whether or not a student’s academic performance depends on his punctuality record at Queen’s Royal College. This study, however, can be improved by collecting data from a larger sample to increase the accuracy of the results and by carrying out the tests for different year groups.