
Thursday, November 30, 2023

‘Pearson Correlation’ | Where is it used | How is it interpreted |

QUESTION

What do you understand by ‘Pearson Correlation’? Where is it used and how is it interpreted?

Course: Educational Statistics

Course code 8614

Level: B.Ed Solved Assignment 

ANSWER

 The Pearson Correlation

The most commonly used correlation is the Pearson Correlation. It is also known as the Pearson product-moment correlation. It measures the degree and the direction of the linear relationship between two variables. It is denoted by r, and r = degree to which X and Y vary together / degree to which X and Y vary separately = co-variability of X and Y / variability of X and Y separately

To calculate the Pearson correlation r we use the formula

$$ r = \frac{SP}{\sqrt{SS_x \, SS_y}} $$

where SP is the sum of products of deviations.

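Two formulas (definitional and computational) are available to calculate the sum of products of deviations. Both formulas are given below, where \bar{X} and \bar{Y} are the means of X and Y and n is the number of pairs of scores.

Definitional: $$ SP = \sum (X - \bar{X})(Y - \bar{Y}) $$

Computational: $$ SP = \sum XY - \frac{\sum X \sum Y}{n} $$

SS is sum of squares, SSx is the sum of squares of the variable X and SSy is the sum of squares of variable Y. In the following lines different formulas are given to calculate SSx and SSy. These formulas are categorized as definitional and computational. The definitional formula for the sum of squares of variable X is:

$$ SS_x = \sum (X - \bar{X})^2 $$

The computational formula for the sum of squares of variable X is:

$$ SS_x = \sum X^2 - \frac{(\sum X)^2}{n} $$

The definitional formula for the sum of squares of variable Y is:

$$ SS_y = \sum (Y - \bar{Y})^2 $$

The computational formula for the sum of squares of variable Y is:

$$ SS_y = \sum Y^2 - \frac{(\sum Y)^2}{n} $$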


It should be kept in mind that whichever formula one uses, definitional or computational, it will yield the same result.
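The calculation can be sketched in Python to check that both routes agree. This is a minimal illustration, assuming the standard form r = SP / sqrt(SSx * SSy); the function names and the five pairs of scores are hypothetical and chosen only for demonstration.

```python
from math import sqrt

def pearson_r_definitional(x, y):
    """Pearson r using deviation scores (definitional formulas)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return sp / sqrt(ss_x * ss_y)

def pearson_r_computational(x, y):
    """Pearson r using raw-score sums (computational formulas)."""
    n = len(x)
    sp = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    ss_x = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    ss_y = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
    return sp / sqrt(ss_x * ss_y)

# Hypothetical scores of five students on two variables
x = [0, 10, 4, 8, 8]
y = [2, 6, 2, 4, 6]

print(pearson_r_definitional(x, y))   # SP = 28, SSx = 64, SSy = 16
print(pearson_r_computational(x, y))  # same sums reached from raw scores
```

For these scores both functions give r = 28 / sqrt(64 * 16) = 0.875, a strong positive linear relationship.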
 

 Using and Interpreting Pearson Correlation


First let us have a brief discussion about where and why we use correlation. The discussion follows under the following headings.

i)  Prediction

If two variables are known to be related in some systematic way, it is possible to use one variable to make predictions about the other. For example, when a student seeks admission to a college, he is required to submit a great deal of personal information, including his scores in the SSC annual/supplementary examination. The college officials want this information so that they can predict that student’s chance of success in college. 

ii)  Validity


Suppose a researcher develops a new test for measuring intelligence. It is necessary that he should show that this new test is valid and truly measures what it claims to measure. One common technique for demonstrating validity is to use correlation. 

If the newly constructed test actually measures intelligence, then the scores on this test should be related to other already established measures of intelligence, for example standardized IQ tests, performance on learning tasks, problem-solving ability, and so on. The newly constructed test can be correlated with each of these measures to demonstrate that the new test is valid. 

iii)  Reliability


Apart from determining validity, correlations are also used to determine reliability. A measurement procedure is reliable if it produces stable and consistent measurements. It means a reliable measurement procedure will produce the same (or nearly the same) scores when the same individuals are measured under the same conditions. One common way to evaluate reliability is to use correlations to determine the relationship between two sets of scores.


iv)  Theory Verification

Many psychological theories make specific predictions about the relationship between two variables. For example, a theory may predict a relationship between brain size and learning ability; between the parent IQ and the child IQ etc. In each case, the prediction of the theory could be tested by determining the correlation between two variables. 

Now let us have a few words on interpreting correlation. When interpreting a correlation, the following considerations should be kept in mind.

i)  Correlation simply describes a relationship between two variables. It does not explain why the two variables are related. That is why correlation cannot be interpreted as proof of a cause-and-effect relationship between two variables. 
ii)  The value of the correlation can be greatly affected by the range of scores represented in the data. A correlation computed from a restricted range of scores may differ considerably from the correlation for the full range.
iii)  One or two extreme data points, often called outliers, can have a dramatic effect on the value of the correlation.
iv)  When judging how good a relationship is, it is tempting to focus on the numerical value of the correlation. For example, a correlation of +0.5 is halfway between 0 and 1.00 and therefore appears to represent a moderate degree of relationship. Here it should be noted that we cannot interpret correlation as a proportion. Although a correlation of 1.00 means that there is a 100% perfectly predictable relationship between variables X and Y, a correlation of 0.5 does not mean that we can make a prediction with 50% accuracy. The appropriate way of describing how accurately one variable predicts the other is to square the correlation. Thus a correlation of r = 0.5 gives r² = (0.5)² = 0.25, or 25% accuracy. The value r² is called the coefficient of determination because it measures the proportion of variability in one variable that can be determined from the relationship with the other variable.
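Two of these cautions, the effect of outliers and the squaring of r, can be illustrated with a short Python sketch. The helper function and all data values below are hypothetical; they are not taken from any real study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation computed from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical scores with almost no linear relationship
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 3]
r_without_outlier = pearson_r(x, y)            # about 0.14

# The same scores plus a single extreme pair (20, 20)
r_with_outlier = pearson_r(x + [20], y + [20])  # about 0.97

# Coefficient of determination for r = 0.5
r = 0.5
r_squared = r ** 2                              # 0.25, i.e. 25%

print(round(r_without_outlier, 2), round(r_with_outlier, 2), r_squared)
```

One added point moves the correlation from weak to very strong, which is why a scatter plot should be inspected before a correlation is interpreted.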







Wednesday, November 22, 2023

Process and Errors in Hypothesis Testing | Educational Statistics | 8614 |

QUESTION

Explain the process and errors in hypothesis testing. 

Course: Educational Statistics

Course code 8614

Level: B.Ed Solved Assignment 

ANSWER

Four-Step Process for Hypothesis Testing

The process of hypothesis testing goes through the following four steps.

i)  Stating the Hypothesis

The process of hypothesis testing begins by stating a hypothesis about the population. Usually, a researcher states two opposing hypotheses. Both hypotheses are stated in terms of unknown population parameters.

The first and most important of the two hypotheses is called the null hypothesis. A null hypothesis states that the treatment has no effect. In general, the null hypothesis states that there is no change, no effect, no difference; in other words, nothing happened. The null hypothesis is denoted by the symbol H0 (H stands for hypothesis and 0 denotes that this is a zero effect).

The null hypothesis (H0) states that in the general population, there is no change, no difference, or no relationship. In an experimental study, the null hypothesis (H0) predicts that the independent variable (treatment) will have no effect on the dependent variable for the population.

The second hypothesis is simply the opposite of the null hypothesis and it is called the scientific or alternative hypothesis. It is denoted by H1. This hypothesis states that the treatment has an effect on the dependent variable.

The alternative hypothesis (H1) states that there is a change, a difference, or a relationship for the general population. In an experiment, H1 predicts that the independent variable (treatment) will have an effect on the dependent variable.

ii)  Setting Criteria for the Decision

In common practice, a researcher uses the data from the sample to evaluate the credibility of the null hypothesis. The data will either support or contradict the null hypothesis. To formalize the decision process, a researcher will use the null hypothesis to predict exactly what kind of sample should be obtained if the treatment has no effect. In particular, a researcher will examine all the possible sample means that could be obtained if the null hypothesis is true. A level of significance (alpha level), commonly .05 or .01, is then chosen to separate the sample means that are likely when the null hypothesis is true from those that are very unlikely.

iii)  Collecting data and computing sample statistics

The next step in hypothesis testing is to obtain the sample data. The raw data are then summarized with appropriate statistics such as the mean, standard deviation, etc. It is then possible for the researcher to compare the sample mean with the value stated in the null hypothesis.

iv)  Make a Decision

In the final step, the researcher decides, in the light of the analysis of the data, whether to reject or fail to reject the null hypothesis. If the sample data are very unlikely to occur when the null hypothesis is true, he rejects it; otherwise he fails to reject it.
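The four steps can be traced in a short Python sketch. It assumes a two-tailed z-test with a known population standard deviation and an alpha level of .05; these choices, the variable names, and every number below are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

# Step i: state the hypotheses (all numbers here are hypothetical).
# H0: the population mean is 80 (the treatment has no effect).
# H1: the population mean is not 80.
mu_0 = 80
sigma = 20  # population standard deviation, assumed known

# Step ii: set the criterion for the decision.
alpha = 0.05
z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed cut-off, about 1.96

# Step iii: collect data and compute the sample statistic.
sample = [70, 75, 80, 85, 88, 90, 90, 92, 94, 95, 96, 98, 100, 102, 105, 112]
standard_error = sigma / sqrt(len(sample))
z = (mean(sample) - mu_0) / standard_error

# Step iv: make a decision.
reject_h0 = abs(z) > z_critical
print(f"z = {z:.2f}, reject H0: {reject_h0}")
```

Here the sample mean is 92 and the standard error is 5, so z = 2.40. This exceeds the cut-off, and the null hypothesis is rejected.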

 

Uncertainty and Error in Hypothesis Testing

Hypothesis testing is an inferential process. It means that it uses limited information obtained from the sample to reach general conclusions about the population. As a sample is a small subset of the population, it provides only limited or incomplete information about the whole population. Yet a hypothesis test relies on the information obtained from the sample. In this situation, there is always the possibility of reaching an incorrect conclusion.

Generally, two kinds of errors can be made.

i)  Type I Errors

A type I error occurs when a researcher rejects a null hypothesis that is actually true. It means that the researcher concludes that the treatment does have an effect when in fact the treatment has no effect.

Type I error is not a stupid mistake in the sense that the researcher is overlooking something that should be perfectly obvious. He is looking at the data obtained from the sample that appear to show a clear treatment effect. The researcher then makes a careful decision based on available information. He never knows whether a hypothesis is true or false.

The consequences of a type I error can be very serious because the researcher has rejected the null hypothesis and believes that the treatment had a real effect. It is likely that the researcher will report or publish the research results. Other researchers may then try to build theories or develop other experiments based on false results.

ii)  Type II Errors

A type II error occurs when a researcher fails to reject a null hypothesis that is really false. It means that a treatment effect really exists, but the hypothesis test has failed to detect it. This type of error is more likely when the effect of the treatment is relatively small. That is, the treatment does influence the sample, but the magnitude of the effect is too small to be detected.

The consequences of a Type II error are usually less serious than those of a Type I error. In the case of a Type II error, the research data do not show the results that the researcher had hoped to obtain. The researcher can accept this outcome and conclude that the treatment either has no effect or has a small effect that is not worth pursuing. Or the researcher can repeat the experiment with some improvement and try to demonstrate that the treatment does work. It is impossible to determine a single, exact probability value for a type II error, because it depends on factors such as the size of the treatment effect and the sample size.

Summarizing, we can say that a hypothesis test always leads to one of two decisions.

i)  The sample data provide sufficient evidence to reject the null hypothesis and the researcher concludes that the treatment has an effect.

ii)  The sample data do not provide enough evidence to reject the null hypothesis. The researcher fails to reject the null hypothesis and concludes that the treatment does not appear to have an effect.
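The link between effect size and Type II errors can be made concrete with a small calculation. The sketch below assumes a two-tailed z-test with known population standard deviation; the function name, the alpha level of .05, and the effect sizes are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error_rate(effect, sigma, n, alpha=0.05):
    """Probability of failing to reject H0 in a two-tailed z-test when the
    true population mean differs from the H0 value by `effect`."""
    standard_normal = NormalDist()
    z_critical = standard_normal.inv_cdf(1 - alpha / 2)
    shift = effect / (sigma / sqrt(n))  # true effect in standard-error units
    # Chance that the z statistic still lands between the two cut-offs
    return (standard_normal.cdf(z_critical - shift)
            - standard_normal.cdf(-z_critical - shift))

alpha = 0.05  # probability of a Type I error when H0 is true

# Hypothetical study: sigma = 20, n = 16
beta_small_effect = type_ii_error_rate(effect=2, sigma=20, n=16)
beta_large_effect = type_ii_error_rate(effect=15, sigma=20, n=16)

print(f"small effect: {beta_small_effect:.2f}")  # above 0.9
print(f"large effect: {beta_large_effect:.2f}")  # below 0.2
```

Under these assumptions a small treatment effect is missed in more than nine tests out of ten, while a large effect is missed far less often. The result changes with every assumed effect size, which is why no single value can be given for the probability of a Type II error.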

Tuesday, November 21, 2023

Procedure for Determining Median | Merits of Median | Demerits of Median | Educational Statistics | 8614 |

 

QUESTION

How do we calculate the median? Also, mention its merits and demerits.

Course: Educational Statistics

Course code 8614

Level: B.Ed Solved Assignment 

ANSWER

Median

The median is the middle value of rank order data. It divides the distribution into two halves (i.e. 50% of scores or observations on either side of the median value). It means that this value separates the higher half of the data set from the lower half. The goal of the median is to determine the precise midpoint of the distribution. The median is appropriate for describing ordinal data.

Procedure for Determining Median

When the number of scores is odd, simply arrange the scores in order (from lower to higher or from higher to lower). The median will be the middle score in the list. Consider the set of scores 2, 5, 7, 10, 12. The score “7” lies in the middle of the scores, so it is the median.

When there is an even number of scores in the distribution, arrange the scores in order (from lower to higher or from higher to lower). The median will be the average of the middle two scores in the list. Consider the set of scores 4, 6, 9, 14, 16, 20. The average of the middle two scores, 11.5 (i.e. (9 + 14)/2 = 23/2 = 11.5), is the median of the distribution.

The median is less affected by outliers and skewed data and is usually the preferred measure of central tendency when the distribution is not symmetrical. The median cannot be determined for categorical or nominal data.
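The procedure can be written as a short Python function. This is a minimal sketch; the function name is hypothetical, and the last data set is invented to show the effect of an outlier.

```python
def median_of(scores):
    """Median of a list of scores, following the odd/even procedure."""
    ordered = sorted(scores)          # arrange the scores in order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                    # odd number of scores: middle score
        return ordered[middle]
    # even number of scores: average of the two middle scores
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median_of([2, 5, 7, 10, 12]))        # 7
print(median_of([4, 6, 9, 14, 16, 20]))    # 11.5

# Replacing the highest score with an extreme value leaves the median unchanged
print(median_of([2, 5, 7, 10, 120]))       # 7
```

The mean of the last list rises from 7.2 to 28.8 when 12 is replaced by 120, while the median stays at 7.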

Merits of Median

i)  It is rigidly defined.

ii)  It is easy to understand and calculate.

iii)  It is not affected by extreme values.

iv)  The median can be calculated even if the extreme values are not known.

v)  It can be located just by inspection in many cases.

vi)  It can be located graphically.

vii)  It is not much affected by sampling fluctuations.

viii)  It can be calculated from data measured on an ordinal scale.

ix)  It is suitable for skewed distribution.

x)  It is easily located in individual and discrete classes.

Demerits of Median

i)  It is not based on all values of the given data.

ii)  For larger data sizes the arrangement of the data in increasing order is a somewhat difficult process.

iii)  It is not capable of further mathematical treatment.

iv)  It is not sensitive to changes in most of the data values, because only the middle scores determine it.

v)  It cannot be used for further mathematical processing.



Monday, November 20, 2023

Measures of Dispersion | Educational Statistics | 8614 |

QUESTION

Explain different measures of dispersion used in educational research.

  • Course: Educational Statistics
  • Course code 8614
  • Level: B.Ed Solved Assignment

ANSWER

Introduction to Measures of Dispersion

Measures of central tendency focus on what is average or in the middle of the distribution of scores. Often the information provided by these measures does not give us a clear picture of the data and we need something more. It means that knowing the mean, median, and mode of a distribution does not always allow us to differentiate between two or more distributions, and we need additional information about the distribution. This additional information is provided by a series of measures which are commonly known as measures of dispersion.

There is dispersion when there is dissimilarity among the data values. The greater the dissimilarity, the greater the degree of dispersion will be.

Measures of dispersion are needed for four basic purposes.

i)  To determine the reliability of an average.

ii)  To serve as a basis for the control of the variability.

iii)  To compare two or more series about their variability.

iv)  To facilitate the use of other statistical measures.

 

The measure of dispersion enables us to compare two or more series concerning their variability. It is also looked at as a means of determining uniformity or consistency. A high degree of variation would mean little consistency or uniformity whereas a low degree of variation would mean greater uniformity or consistency among the data set. Commonly used measures of dispersion are range, quartile deviation, mean deviation, variance, and standard deviation.

Range

The range is the simplest measure of spread and is the difference between the highest and lowest scores in a data set. In other words, we can say that the range is the distance between the largest score and the smallest score in the distribution. We can calculate the range as:

Range = Highest value of the data – The lowest value of the data


For example, if the lowest and highest marks scored in a test are 22 and 95 respectively, then

Range = 95 – 22 = 73

The range is the easiest measure of dispersion to calculate and is useful when you wish to evaluate the whole of a dataset. However, it is not considered a good measure of dispersion because it uses only the two extreme scores and ignores the other information related to the spread. Outliers, whether extremely low or extremely high values, can considerably affect the range.
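A short Python sketch shows the calculation and the sensitivity of the range to a single outlier. The function name and the individual marks are hypothetical; only the lowest (22) and highest (95) marks come from the example above.

```python
def score_range(scores):
    """Range = highest value of the data minus lowest value of the data."""
    return max(scores) - min(scores)

# Hypothetical test marks with lowest 22 and highest 95
marks = [22, 47, 58, 63, 71, 95]
print(score_range(marks))          # 73

# One extremely low mark changes the range considerably
print(score_range(marks + [2]))    # 93
```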

Quartiles

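The values that divide the given set of data into four equal parts are called quartiles and are denoted by Q1, Q2, and Q3. Q1 is called the lower quartile and Q3 is called the upper quartile. 25% of scores are less than Q1 and 75% of scores are less than Q3. Q2 is the median. For n scores arranged in ascending order, one commonly used set of formulas for the quartiles is:

$$ Q_1 = \text{value of the } \frac{n+1}{4}\text{th score} $$

$$ Q_2 = \text{value of the } \frac{2(n+1)}{4}\text{th score} $$

$$ Q_3 = \text{value of the } \frac{3(n+1)}{4}\text{th score} $$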


Quartile Deviation (QD)

Quartile deviation or semi inter-quartile range is one-half the difference between the first and the third quartile, i.e.

QD = (Q3 – Q1) / 2

Where Q1 = the first quartile (lower quartile)

Q3 = third quartile (upper quartile)

Calculating quartile deviation from ungrouped data:

To calculate quartile deviation from ungrouped data, the following steps are used.

i)  Arrange the test scores from lowest to highest.

ii)  Assign a serial number to each score. The first serial number is assigned to the lowest score.

iii)  Determine the position of the first quartile (Q1). One commonly used formula, where n is the number of scores, is

$$ \text{Position of } Q_1 = \frac{n+1}{4} $$

Use the obtained value to locate the serial number of the score that falls under Q1.

iv)  Determine the position of the third quartile (Q3). Under the same convention the formula is

$$ \text{Position of } Q_3 = \frac{3(n+1)}{4} $$

Locate the serial number corresponding to the obtained answer. Opposite to this number is the test score corresponding to Q3.

v)  Subtract Q1 from Q3, and divide the difference by 2.
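These steps can be sketched in Python. The code assumes the (n + 1)/4 position convention, which is one of several in use, and interpolates between neighbouring scores when a position is fractional; the function names and test scores are hypothetical.

```python
def value_at_position(ordered, position):
    """Score at a 1-based serial position in an ordered list, interpolating
    linearly between neighbouring scores when the position is fractional."""
    lower = int(position)             # serial number at or just below the position
    fraction = position - lower
    if fraction == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

def quartile_deviation(scores):
    """Semi-interquartile range, QD = (Q3 - Q1) / 2."""
    if len(scores) < 3:
        raise ValueError("this sketch needs at least three scores")
    ordered = sorted(scores)                            # steps i and ii
    n = len(ordered)
    q1 = value_at_position(ordered, (n + 1) / 4)        # step iii
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)    # step iv
    return (q3 - q1) / 2                                # step v

# Hypothetical test scores: Q1 is the 2nd score (15), Q3 is the 6th score (27)
scores = [12, 15, 18, 20, 24, 27, 30]
print(quartile_deviation(scores))   # 6.0
```

Statistical packages may use other position rules, so their quartiles can differ slightly from the values produced here.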

Mean Deviation or Average Deviation

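The mean or the average deviation is defined as the arithmetic mean of the deviations of the scores from the mean or the median. The deviations are taken as positive. Mathematically, for n scores X with mean \bar{X},

$$ MD = \frac{\sum |X - \bar{X}|}{n} $$

When the deviations are taken from the median, the median replaces \bar{X} in the formula.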
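A minimal Python sketch of this definition follows. It assumes the usual form, the sum of absolute deviations divided by the number of scores; the function name, the optional `center` parameter, and the scores are hypothetical.

```python
def mean_deviation(scores, center=None):
    """Average of the absolute deviations of the scores from `center`.
    If no center is given, the arithmetic mean of the scores is used."""
    n = len(scores)
    if center is None:
        center = sum(scores) / n
    return sum(abs(x - center) for x in scores) / n

# Hypothetical scores with mean 6: deviations are 4, 2, 0, 2, 4
scores = [2, 4, 6, 8, 10]
print(mean_deviation(scores))             # 2.4
print(mean_deviation(scores, center=6))   # deviations taken from the median, also 2.4
```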

Standard Deviation

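Standard deviation is the most commonly used and the most important measure of variation. It determines whether the scores are generally near or far from the mean, i.e. whether the scores are clustered together or scattered. In simple words, standard deviation tells how tightly all the scores are clustered around the mean in a data set. When the scores are close to the mean, the standard deviation is small. A large standard deviation tells that the scores are spread apart. Standard deviation is simply the square root of variance, i.e.

$$ SD = \sqrt{\text{Variance}} $$

Variance

Variance is the average of the squared deviations of the scores from the mean. For a population of N scores with mean \mu,

$$ \sigma^2 = \frac{\sum (X - \mu)^2}{N} $$

For a sample of n scores, the sum of squared deviations is usually divided by n - 1 instead of n.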
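The relationship between variance and standard deviation can be checked with a few lines of Python. The sketch assumes the usual definitions: variance as the sum of squared deviations from the mean divided by N for a population, or by n - 1 for a sample. The function names and scores are hypothetical.

```python
from math import sqrt

def variance(scores, sample=False):
    """Sum of squared deviations from the mean, divided by N for a
    population or by n - 1 when the scores are treated as a sample."""
    n = len(scores)
    mean = sum(scores) / n
    sum_of_squares = sum((x - mean) ** 2 for x in scores)
    return sum_of_squares / (n - 1 if sample else n)

def standard_deviation(scores, sample=False):
    """Standard deviation is the square root of the variance."""
    return sqrt(variance(scores, sample))

# Hypothetical scores with mean 5 and sum of squared deviations 32
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(scores))             # 4.0
print(standard_deviation(scores))   # 2.0
```

Python's built-in `statistics` module offers `pvariance`, `pstdev`, `variance`, and `stdev` for the same calculations.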





Thursday, November 16, 2023

Non-Probability Sampling | Educational Statistics |

 

QUESTION

Explain non-probability sampling techniques used in educational research. 

Course: Educational Statistics

Course code 8614

Level: B.Ed Solved Assignment 

ANSWER

Non-Probability Sampling

This technique depends on subjective judgment. It is a process where probabilities cannot  be assigned to the individuals objectively. It means that in this technique samples are gathered in a way that does not give all individuals in the population equal chances of being selected. Choosing these methods could result in biased data or a limited ability to make general inferences based on the findings. But there are also many situations in which choosing this kind of sampling technique is the best choice for a particular research question or the stage of research.

There are four kinds of non-probability sampling techniques.

i)  Convenience Sampling

In this technique a researcher relies on available subjects, such as stopping people in the markets or on street corners as they pass by. This method is extremely risky and does not allow the researcher to have any control over the representativeness of the sample. It is useful when the researcher wants to know the opinion of the masses on a current issue; or the characteristics of people passing by on streets at a certain point of time; or if time and resources are limited in such a way that the research would not be possible otherwise. Whatever the reason for selecting a convenience sample, its results cannot be generalized to a wider population.

ii)  Purposive or Judgmental Sampling

In this technique, a sample is selected on the basis of the knowledge of the population and the purpose of the study. For example, when an educational psychologist wants to study the emotional and psychological effects of corporal punishment, he will create a sample that includes only those students who have received corporal punishment. In this case, the researcher uses a purposive sample because those being selected fit a specific purpose or description that is necessary to conduct the research.

iii)  Snowball Sample

This type of sampling is appropriate when the members of the population are difficult to locate, such as homeless industry workers, undocumented immigrants, etc. A snowball sample is one in which the researcher collects data on the few members of the target population he or she can locate, then asks those individuals to provide the information needed to locate other members of that population whom they know. For example, if a researcher wants to interview undocumented immigrants from Afghanistan, he might interview a few undocumented individuals he knows or can locate, and would then rely on those subjects to help locate more undocumented individuals. This process continues until the researcher has all the interviews he needs or all contacts have been exhausted. This technique is useful when studying a sensitive topic that people might not openly talk about, or if talking about the issue under investigation could jeopardize their safety.

iv)  Quota Sample

A quota sample is one in which units are selected into a sample on the basis of pre-specified characteristics so that the total sample has the same distribution of characteristics assumed to exist in the population. For example, if a researcher wants a national quota sample, he might need to know what proportion of the population is male and what proportion is female, as well as what proportion of each gender falls into different age and educational categories. The researcher would then collect a sample with the same proportions as the national population.
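The logic of filling quotas can be sketched in Python. The example assumes a hypothetical population that is 60% female and 40% male and a desired sample of 10, giving quotas of 6 and 4. People are taken in the order they happen to be met, not at random, which is what keeps this a non-probability technique. All names and data are invented for illustration.

```python
def quota_sample(people_in_order_met, key, quotas):
    """Accept people in the order they are encountered until the quota
    for every category has been filled."""
    remaining = dict(quotas)          # how many more are needed per category
    sample = []
    for person in people_in_order_met:
        category = person[key]
        if remaining.get(category, 0) > 0:
            sample.append(person)
            remaining[category] -= 1
        if not any(remaining.values()):   # every quota is filled
            break
    return sample

# Hypothetical passers-by: every third person is male, the rest are female
passers_by = [
    {"id": i, "gender": "male" if i % 3 == 0 else "female"}
    for i in range(30)
]

# Quotas mirror the assumed population proportions (60% female, 40% male)
sample = quota_sample(passers_by, "gender", {"female": 6, "male": 4})
print(len(sample))
print(sum(1 for p in sample if p["gender"] == "female"))
```

The resulting sample matches the assumed proportions, but because selection within each category is not random, it can still be unrepresentative in other respects.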

Saturday, November 11, 2023

Descriptive and Inferential Statistics | Educational Statistics |

 QUESTION

How do descriptive and inferential statistics help a teacher? Explain. 

Course: Educational Statistics
Course code 8614
Level: B.Ed Solved Assignment 

ANSWER

Descriptive and Inferential Statistics 

Researchers use a variety of statistical procedures to organize and interpret data. These procedures can be classified into two categories – Descriptive Statistics and Inferential Statistics. The starting point for dealing with a collection of data is to organize, display, and summarize it effectively. It is the major objective of descriptive statistics.

Descriptive Statistics, as the name implies, describes the data. Descriptive statistics consist of methods for organizing and summarizing information. These are statistical procedures that are used to organize, summarize, and simplify data. In these techniques, raw scores are taken and statistical techniques are applied to put them into a more manageable form. These techniques allow the researcher to describe a large amount of information or scores with a few indices such as mean, median, standard deviation, etc. When these indices are calculated for a sample, they are called statistics; and when they are calculated from the entire population, they are called parameters (Fraenkel, Wallen, & Hyun, 2012). Descriptive statistics organizes scores in the form of a table or a graph. It is especially useful when the researcher finds it necessary to handle interrelationships among more than two variables.

Only summarizing and organizing data is not the whole purpose of a researcher. He often wishes to make inferences about a population based on data he has obtained from a sample. For this purpose, he uses inferential statistics. Inferential Statistics are techniques that allow a researcher to study samples and then make generalizations about the populations from which they are selected.

The population of a research study is typically too large for a researcher to observe each individual. Therefore a sample is selected. By analyzing the results obtained from a sample, a researcher hopes to draw a general conclusion about the population. One problem with using a sample is that a sample provides only limited information about the population. To address this problem, the sample should be representative of the population. That is, the general characteristics of the sample should be consistent with the characteristics of the population.

