

Sunday, December 3, 2023

Chi-Square | independent test | Educational Statistics | 8614 |

 QUESTION

Explain Chi-Square. Also, discuss it as an independent test.

  • Course: Educational Statistics
  • Course code 8614
  • Level: B.Ed Solved Assignment 

ANSWER

The Chi-Square Distribution

The Chi-Square (or Chi-Squared, χ2) distribution is a special case of the gamma distribution (the gamma distribution is a family of right-skewed, continuous probability distributions; these distributions are useful in real life where something has a natural minimum of 0). A chi-square distribution with n degrees of freedom is equal to a gamma distribution with shape a = n/2 and rate b = 0.5 (or scale β = 2).

Let us consider a random sample taken from a standard normal distribution. The chi-square distribution is the distribution of the sum of the squares of these random values. The degrees of freedom (say k) are equal to the number of values being summed. For example, if 10 values are taken from the normal distribution, then the degrees of freedom are df = 10. Chi-square distributions are always right-skewed. The greater the degrees of freedom, the more the chi-square distribution looks like a normal distribution.

Uses of Chi-Square (χ2) Distribution

The chi-square distribution has many uses which include:

i)  Confidence interval estimation for a population standard deviation of a normal distribution from a sample standard deviation.

ii)  Independence of two criteria of classification of qualitative variables (contingency tables).

iii)  Relationship between categorical variables.

iv)  Sample variance study when the underlying distribution is normal.

v)  Tests of deviations of differences between expected and observed frequencies (one-way table).

vi)  The chi-square test (a goodness of fit test).

What is a Chi-Square Statistic?

A chi-square statistic is one way to measure the relationship between two categorical (non-numerical) variables. The chi-square statistic is a single number that tells us how much difference exists between the observed counts and the counts one would expect if there were no relationship in the population.

There are two different types of chi-square tests, both involving categorical data. These are:

a)  A chi-square goodness of fit test, and

b)  A chi-square test of independence.

In the coming lines, these tests will be dealt with in some detail.

 

Chi-Square Independence Test

A chi-square (χ2) test of independence is the second important form of a chi-square test. It is used to explore the relationship between two categorical variables. Each of these variables can have two or more categories.

It determines whether there is a significant relationship between two nominal (categorical) variables. The frequency of one nominal variable is compared across the values of the second nominal variable. The data can be displayed in an R*C contingency table, where R is the number of rows and C is the number of columns. For example, suppose a researcher wants to examine the relationship between gender (male and female) and empathy (high vs. low). The researcher will use the chi-square test of independence. If the null hypothesis is retained, the conclusion is that there is no relationship between gender and empathy. If the null hypothesis is rejected, the conclusion is that there is a relationship between gender and empathy (e.g. females tend to score higher on empathy and males tend to score lower).

The chi-square test of independence, being a non-parametric technique, follows less strict assumptions. Still, some general assumptions should be taken care of:

Random Sample –

 The sample should be selected using a simple random sampling method.

Variables –

Both variables under study should be categorical.

Independent Observations –

Each person or case should be counted only once and none should appear in more than one category of group. The data from one subject should not influence the data from another subject.

Expected Frequencies –

If the data are displayed in a contingency table, the expected frequency count for each cell of the table should be at least 5.

The two chi-square tests are sometimes confused, but they are quite different from each other.

  • The chi-square test for independence compares two sets of data to see if there is a relationship.
  • The chi-square goodness of fit test is to fit one categorical variable to a distribution.
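As a sketch, the independence test for the gender and empathy example above can be computed by hand in Python; the observed counts below are hypothetical.

```python
# Sketch of a chi-square test of independence on an R*C contingency table.

def chi_square_independence(observed):
    """Return the chi-square statistic and degrees of freedom
    for an R x C contingency table given as a list of lists."""
    rows = len(observed)
    cols = len(observed[0])
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[r][c] for r in range(rows)) for c in range(cols)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for r in range(rows):
        for c in range(cols):
            # Expected count if the two variables were independent
            expected = row_totals[r] * col_totals[c] / grand_total
            chi2 += (observed[r][c] - expected) ** 2 / expected
    df = (rows - 1) * (cols - 1)
    return chi2, df

# Rows: male, female; columns: high empathy, low empathy (hypothetical counts)
observed = [[20, 30],
            [30, 20]]
chi2, df = chi_square_independence(observed)
print(chi2, df)  # 4.0 with df = 1
```

With df = 1, the computed statistic of 4.0 exceeds the 0.05 critical value of 3.841, so the null hypothesis of independence would be rejected for these hypothetical counts.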


Related Topics

ANOVA and its Logics

Measures of Dispersion

Descriptive and Inferential Statistics

What is data Cleaning? Importance and Benefits of Data Cleaning 

Explain the terms Degree of Freedom, Spread of Scores, Sample, Z Score, Confidence Interval

What is measure of difference? Explain different types of test

Concept of Reliability, Types and methods of Reliability

Level of Measurement

Types of Variable in Stats 

Measures of Central Tendency and Dispersion

Role of Normal Distribution, and also note on Skewness and Kurtosis

Methods of Effective Presentation


Friday, December 1, 2023

ANOVA and its Logic | Educational Statistics | 8614 |

QUESTION

Explain ANOVA and its Logic?

  • Course: Educational Statistics
  • Course code 8614
  • Level: B.Ed Solved Assignment

ANSWER

Introduction to Analysis of Variance (ANOVA)

The t-tests have one very serious limitation – they are restricted to tests of the significance of the difference between only two groups. There are many times when we would like to see if there are significant differences among three, four, or even more groups. For example, we may want to investigate which of three teaching methods is best for teaching ninth-class algebra. In such a case, we cannot use a t-test because more than two groups are involved. To deal with such cases, one of the most useful techniques in statistics is analysis of variance (abbreviated as ANOVA). This technique was developed by the British statistician Ronald A. Fisher (Dietz & Kalof, 2009; Bartz, 1981).

Analysis of Variance (ANOVA) is a hypothesis testing procedure that is used to evaluate mean differences between two or more treatments (or populations). Like all other inferential procedures, ANOVA uses sample data as a basis for drawing general conclusions about populations. In some situations, ANOVA and t-tests are simply two different ways of doing exactly the same thing: testing for mean differences. In those cases, both tests use sample data to test hypotheses about population means.

However, ANOVA has clear advantages over t-tests. t-tests are used when we have to compare only two groups or variables (one independent and one dependent). ANOVA, on the other hand, is used when we have two or more treatment conditions. Suppose we want to study the effects of three different models of teaching on the achievement of students. In this case, we have three different samples to be treated using three different treatments, so ANOVA is the suitable technique to evaluate the difference.

 

Logic of ANOVA

Let us take the hypothetical data given in the table.




There are three separate samples, with n = 5 in each sample. The dependent variable is the number of problems solved correctly. These data represent the results of an independent-measures experiment comparing learning performance under three temperature conditions. The scores are variable, and we want to measure the amount of variability (i.e. the size of the differences) and explain where it comes from.

To measure the total variability, we will combine all the scores from all the separate samples into one group and then obtain one general measure of variability for the complete experiment. Once we have measured the total variability, we can begin to break it into separate components. The word analysis means breaking into smaller parts.

Because we are going to analyze the variability, the process is called analysis of variance (ANOVA). This analysis process divides the total variability into two basic components:

i)  Between-Treatment Variance

Variance simply means difference, and calculating the variance is a process of measuring how big the differences are for a set of numbers. The between-treatment variance measures how much difference exists between the treatment conditions. In addition to measuring the differences between treatments, the overall goal of ANOVA is to evaluate those differences. Specifically, the purpose of the analysis is to distinguish between two alternative explanations.

a)  The differences between the treatments have been caused by the treatment effects.

b)  The differences between the treatments are simply due to chance. 

Thus, there are always two possible explanations for the variance (difference) that exists between treatments:

1)  Treatment Effect: 

The differences are caused by the treatments. Suppose the scores in sample 1 are obtained at a room temperature of 50° and those in sample 2 at 70°. The difference between the samples may then be caused by the difference in room temperature.

2)  Chance: 

The differences are simply due to chance. If there is no treatment effect, even then we can expect some difference between samples. The chance differences are unplanned and unpredictable differences that are not caused or explained by any action of the researcher. Researchers commonly identify two primary sources for chance differences.

  Individual Differences

Each participant in the study has their own individual characteristics. Although it is reasonable to expect that different subjects will produce different scores, it is impossible to predict exactly what the difference will be. 

   Experimental Error

In any measurement, there is a chance of some degree of error. Thus, if a researcher measures the same individual twice under the same conditions, there is a possibility of obtaining two different measurements. Often these differences are unplanned and unpredictable, so they are considered to be by chance.

Thus, when we calculate the between-treatment variance, we are measuring differences that could be either a treatment effect or simply due to chance. To demonstrate that the difference is really a treatment effect, we must establish that the differences between treatments are bigger than would be expected by chance alone. To accomplish this goal, we will determine how big the differences are when there is no treatment effect involved. That is, we will measure how much difference (variance) occurs by chance. To measure chance differences, we compute the variance within treatments.

ii)  Within-Treatment Variance

Within each treatment condition, we have a set of individuals who are treated exactly the same, and the researcher does nothing that would cause these individual participants to have different scores. For example, the data show that five individuals were treated at a 70° room temperature. Although these five students were all treated exactly the same, their scores are different. Why are the scores different? A plain answer is that it is due to chance. The overall analysis of variance thus identifies the sources of variability that are measured by each of the two basic components.
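The partition described above can be sketched in Python. The scores below are hypothetical values for three temperature conditions with n = 5 each (the original data table is not reproduced in this post):

```python
# Sketch of partitioning total variability into between-treatment and
# within-treatment components for a one-way ANOVA.

def anova_components(groups):
    """Return (SS_between, SS_within, F) for a list of sample lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)

    # Between-treatment variability: differences among the treatment means
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-treatment variability: chance differences inside each treatment
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_ratio

# Hypothetical scores for 50°, 70°, and 90° conditions
groups = [[0, 1, 3, 1, 0],
          [4, 3, 6, 3, 4],
          [1, 2, 2, 0, 0]]
ss_b, ss_w, f = anova_components(groups)
print(ss_b, ss_w, round(f, 2))  # 30.0 16.0 11.25
```

A large F-ratio, as here, indicates that the between-treatment differences are much bigger than would be expected from chance (within-treatment) variability alone.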










Tuesday, November 21, 2023

Procedure for Determining Median | Merits of Median | Demerits of Median | Educational Statistics | 8614 |

 

QUESTION

How do we calculate the median? Also, mention its merits and demerits.
Course: Educational Statistics

Course code 8614

Level: B.Ed Solved Assignment 

ANSWER

Median

The median is the middle value of rank order data. It divides the distribution into two halves (i.e. 50% of scores or observations on either side of the median value). It means that this value separates the higher half of the data set from the lower half. The goal of the median is to determine the precise midpoint of the distribution. The median is appropriate for describing ordinal data.

Procedure for Determining Median

When the number of scores is odd, simply arrange the scores in order (from lower to higher or from higher to lower). The median will be the middle score in the list. Consider the set of scores 2, 5, 7, 10, 12. The score “7” lies in the middle of the scores, so it is the median. When there is an even number of scores in the distribution, arrange the scores in order (from lower to higher or from higher to lower). The median will be the average of the middle two scores in the list. Consider the set of scores 4, 6, 9, 14, 16, 20. The average of the middle two scores, 11.5 (i.e. (9 + 14)/2 = 23/2 = 11.5), is the median of the distribution.

The median is less affected by outliers and skewed data and is usually the preferred measure of central tendency when the distribution is not symmetrical. The median cannot be determined for categorical or nominal data.
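The procedure above can be checked with Python's statistics module:

```python
# Median of an odd-sized and an even-sized set of scores.
import statistics

# Odd number of scores: the middle score is the median
print(statistics.median([2, 5, 7, 10, 12]))      # 7

# Even number of scores: the average of the two middle scores
print(statistics.median([4, 6, 9, 14, 16, 20]))  # 11.5
```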

Merits of Median

i)  It is rigidly defined.

ii)  It is easy to understand and calculate.

iii)  It is not affected by extreme values.

iv)  Even if the extreme values are not known, the median can be calculated.

v)  It can be located just by inspection in many cases.

vi)  It can be located graphically.

vii)  It is not much affected by sampling fluctuations.

viii)  It can be calculated by data based on an ordinal scale.

ix)  It is suitable for skewed distribution.

x)  It is easily located in individual and discrete classes.

Demerits of Median

i)  It is not based on all values of the given data.

ii)  For larger data sizes, arranging the data in increasing order is a somewhat difficult process.

iii)  It is not capable of further mathematical treatment.

iv)  It is not sensitive to small changes in the data values.

v)  It cannot be used for further mathematical processing.



Monday, November 20, 2023

Measures of Dispersion | Educational Statistics | 8614 |

QUESTION

Explain different measures of dispersion used in educational research.

  • Course: Educational Statistics
  • Course code 8614
  • Level: B.Ed Solved Assignment

ANSWER

Introduction to Measures of Dispersion

Measures of central tendency focus on what is average or in the middle of a distribution of scores. Often the information provided by these measures does not give us a clear picture of the data, and we need something more. Knowing the mean, median, and mode of a distribution does not, by itself, allow us to differentiate between two or more distributions; we need additional information about the distribution. This additional information is provided by a series of measures commonly known as measures of dispersion.

There is dispersion when there is dissimilarity among the data values. The greater the dissimilarity, the greater the degree of dispersion will be.

Measures of dispersion are needed for four basic purposes.

i)  To determine the reliability of an average.

ii)  To serve as a basis for the control of the variability.

iii)  To compare two or more series about their variability.

iv)  To facilitate the use of other statistical measures.

 

Measures of dispersion enable us to compare two or more series concerning their variability. Dispersion is also looked at as a means of determining uniformity or consistency. A high degree of variation would mean little consistency or uniformity, whereas a low degree of variation would mean greater uniformity or consistency among the data set. Commonly used measures of dispersion are the range, quartile deviation, mean deviation, variance, and standard deviation.

Range

The range is the simplest measure of spread and is the difference between the highest and lowest scores in a data set. In other words, we can say that the range is the distance between the largest score and the smallest score in the distribution. We can calculate the range as:

Range = Highest value of the data – Lowest value of the data


For example, if the lowest and highest marks scored in a test are 22 and 95 respectively, then

Range = 95 – 22 = 73

The range is the easiest measure of dispersion and is useful when you wish to evaluate the whole of a dataset. However, it is not considered a good measure of dispersion as it does not utilize the other information related to the spread. The outliers, either extremely low or extremely high value, can considerably affect the range.

Quartiles

The values that divide the given set of data into four equal parts are called quartiles and are denoted by Q1, Q2, and Q3. Q1 is called the lower quartile and Q3 the upper quartile. 25% of scores are less than Q1 and 75% of scores are less than Q3. Q2 is the median. For n scores arranged in ascending order, the quartile positions are commonly taken as:

Q1 = value at position (n + 1)/4

Q2 = value at position (n + 1)/2

Q3 = value at position 3(n + 1)/4

Quartile Deviation (QD)

Quartile deviation or semi inter-quartile range is one-half the difference between the first and the third quartile, i.e.

QD = (Q3 – Q1) / 2

Where Q1 = the first quartile (lower quartile)

Q3 = third quartile (upper quartile)

Calculating quartile deviation from ungrouped data:

To calculate quartile deviation from ungrouped data, the following steps are used.

i)  Arrange the test scores from highest to lowest.

ii)  Assign a serial number to each score. The first serial number is assigned to the lowest score.

iii)  Determine the first quartile (Q1) by using the formula (n + 1)/4 to obtain its serial position. Use the obtained value to locate the serial number of the score that falls under Q1.

iv)  Determine the third quartile (Q3) by using the formula 3(n + 1)/4. Locate the serial number corresponding to the obtained answer; opposite this number is the test score corresponding to Q3.

v)  Subtract Q1 from Q3, and divide the difference by 2.

Mean Deviation or Average Deviation

The mean or average deviation is defined as the arithmetic mean of the deviations of the scores from the mean or the median, where the deviations are taken as positive. Mathematically, MD = Σ|X – X̄| / N, where X̄ is the mean (or median) and N is the number of scores.

Standard Deviation

Standard deviation is the most commonly used and the most important measure of variation. It determines whether the scores are generally near or far from the mean, i.e. whether the scores are clustered together or scattered. In simple words, the standard deviation tells how tightly all the scores are clustered around the mean in a data set. When the scores are close to the mean, the standard deviation is small; a large standard deviation tells that the scores are spread apart. The standard deviation is simply the square root of the variance, i.e.

SD = √Variance, where Variance = Σ(X – X̄)² / N


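The measures described above can be sketched in Python. The scores below are hypothetical, chosen so that the range matches the 95 – 22 = 73 example:

```python
# Computing the range, mean deviation, standard deviation, and variance
# for a small hypothetical set of test scores.
import statistics

scores = [22, 40, 55, 70, 95]

# Range: highest value minus lowest value
score_range = max(scores) - min(scores)            # 95 - 22 = 73

# Mean deviation: average absolute deviation from the mean
mean = statistics.mean(scores)
mean_deviation = sum(abs(x - mean) for x in scores) / len(scores)

# Standard deviation: square root of the (population) variance
std_dev = statistics.pstdev(scores)
variance = statistics.pvariance(scores)

print(score_range)               # 73
print(round(mean_deviation, 2))  # 20.88
print(round(std_dev, 2), round(variance, 2))
```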



Saturday, November 11, 2023

Descriptive and Inferential Statistics | Educational Statistics |

 QUESTION

How do descriptive and inferential statistics help a teacher? Explain. 

Course: Educational Statistics
Course code 8614
Level: B.Ed Solved Assignment 

ANSWER

Descriptive and Inferential Statistics 

Researchers use a variety of statistical procedures to organize and interpret data. These procedures can be classified into two categories – Descriptive Statistics and Inferential Statistics. The starting point for dealing with a collection of data is to organize, display, and summarize it effectively. It is the major objective of descriptive statistics.

Descriptive Statistics, as the name implies, describe the data. Descriptive statistics consist of methods for organizing and summarizing information; they are statistical procedures used to organize, summarize, and simplify data. In these techniques, raw scores are taken and statistical techniques are applied to obtain a more manageable form. They allow the researcher to describe a large amount of information or scores with a few indices such as the mean, median, and standard deviation. When these indices are calculated for a sample, they are called statistics; when they are calculated for the entire population, they are called parameters (Fraenkel, Wallen, & Hyun, 2012). Descriptive statistics organize scores in the form of a table or a graph, which is especially useful when the researcher needs to handle interrelationships among more than two variables.

Only summarizing and organizing data is not the whole purpose of a researcher. He often wishes to make inferences about a population based on data he has obtained from a sample. For this purpose, he uses inferential statistics. Inferential Statistics are techniques that allow a researcher to study samples and then make generalizations about the populations from which they are selected.

The population of a research study is typically too large for a researcher to observe each individual, so a sample is selected. By analyzing the results obtained from the sample, a researcher hopes to make general conclusions about the population. One problem with using a sample is that it provides only limited information about the population. To address this problem, the sample should be representative of the population; that is, the general characteristics of the sample should be consistent with the characteristics of the population.



Friday, June 11, 2021

What is data cleaning? Write down its importance and benefits. How to ensure it before analysis of data.| Introduction to Educational Statistics | aiou solved assignment | Course Code 8614

 

Q.5 What is data cleaning? Write down its importance and benefits. How to ensure it before the analysis of data? 

Course:  Introduction to Educational Statistics 

Course Code 8614

Topics 

  • What is Data Cleaning?
  • Importance of Data cleaning
  • Benefits of Data cleaning
  • Data Cleansing for a Cleaner Database

AIOU Solved Assignment |Semester: Autumn/Spring | B.Ed/Bachelors in Education /Masters in Education / PhD in Education | BEd / MEd / M Phil Education | ASSIGNMENT Course Code 8614| course: Introduction to Educational Statistics

Answer:

 

Data cleansing or data cleaning is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database and refers to identifying incomplete, incorrect, inaccurate, or irrelevant parts of the data and then replacing, modifying, or deleting the dirty or coarse data. Data cleansing may be performed interactively with data wrangling tools or as batch processing through scripting. 

After cleansing, a data set should be consistent with other similar data sets in the system. The inconsistencies detected or removed may have been originally caused by user entry errors, corruption in transmission or storage, or by different data dictionary definitions of similar entities in different stores. Data cleaning differs from data validation in that validation almost invariably means data is rejected from the system at entry and is performed at the time of entry, rather than on batches of data. 

The actual process of data cleansing may involve removing typographical errors or validating and correcting values against a known list of entities. The validation may be strict (such as rejecting any address that does not have a valid postal code) or fuzzy (such as correcting records that partially match existing, known records). Some data cleansing solutions will clean data by cross-checking with a validated data set. A common data cleansing practice is data enhancement, where data is made more complete by adding related information. For example, appending addresses with any phone numbers related to that address.

Data cleansing may also involve activities like the harmonization and standardization of data. Harmonization might, for example, convert shortcodes (st, rd, etc.) to actual words (street, road, etc.). Standardization of data is a means of changing a reference data set to a new standard, e.g. the use of standard codes.
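A minimal sketch of shortcode harmonization as described above; the mapping and the helper name are illustrative, not taken from any real cleansing tool:

```python
# Harmonizing address shortcodes to their full words.

SHORTCODES = {"st": "street", "rd": "road", "ave": "avenue"}

def harmonize(address):
    """Replace known shortcodes in an address with their full words."""
    words = []
    for word in address.split():
        key = word.lower().rstrip(".")
        replacement = SHORTCODES.get(key)
        if replacement is not None:
            # Preserve the capitalization style of the original token
            replacement = replacement.capitalize() if word[0].isupper() else replacement
            words.append(replacement)
        else:
            words.append(word)
    return " ".join(words)

print(harmonize("123 Main St"))  # 123 Main Street
print(harmonize("7 Mill rd."))   # 7 Mill road
```

A real cleansing tool would combine such rules with fuzzy matching against validated reference data, as discussed above.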

Data cleansing is a valuable process that can help companies save time and increase their efficiency. Data cleansing software tools are used by various organizations to remove duplicate data and to fix and amend badly formatted, incorrect, and incomplete data from marketing lists, databases, and CRMs. They can achieve in a short period what would take an administrator days or weeks to fix manually. This means that companies can save not only time but also money by acquiring data-cleaning tools.

Data cleansing is of particular value to organizations that have vast swathes of data to deal with. These organizations can include banks or government organizations, but small to medium enterprises can also find good use for the programs. In fact, many sources suggest that any firm that works with and holds data should invest in cleansing tools. The tools should also be used regularly, as inaccurate data levels can grow quickly, compromising the database and decreasing business efficiency.

 

Data Cleansing for a Cleaner Database 

Companies may also find that cleansing enables them to remain compliant with standards that are legally expected of them. In most territories, companies are duty-bound to ensure that their data is as accurate and current as possible. The tools can be used for everything from correcting spelling mistakes to postcodes, whilst removing unnecessary records from systems, which means that space, can be preserved and that information that is no longer needed – or data that companies are no longer permitted to keep – can be removed simply, quickly and efficiently. 

Users of data cleansing software can set their own rules to increase the efficiency of a database, making the capabilities of the cleansing software as applicable to the company’s needs and requirements as possible. Some common problems with databases include incorrectly formatted phone numbers and e-mail addresses, rendering clients and customers uncontactable.

The software can be used to put things right in a matter of seconds. This makes it a perfect tool for companies that need to stay in touch with outside parties. Meanwhile, companies that employ more than one database, for example those spread across various branches or offices, can use the tools to ensure that each branch of their organization shares the same accurate information.


Explain the terms Degree of Freedom, Spread of Scores, Sample, Z Score, Confidence Interval | Introduction to Educational Statistics | aiou solved assignment | Course Code 8614

 

Q.4 Explain the following terms with examples.

 

Course:  Introduction to Educational Statistics 

Course Code 8614

Topics 

  • Degree of Freedom
  • Spread of Scores
    • When can we measure spread?
    • Why do we measure spread?
  • Sample
  • Confidence Interval
  • Z Score

 


a)  Degree of Freedom 

In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary. The number of independent ways by which a dynamic system can move, without violating any constraint imposed on it, is called its number of degrees of freedom. In other words, the number of degrees of freedom can be defined as the minimum number of independent coordinates that can specify the position of the system completely.

Estimates of statistical parameters can be based on different amounts of information or data. The number of independent pieces of information that go into the estimate of a parameter is called the degrees of freedom. In general, the degrees of freedom of an estimate of a parameter are equal to the number of independent scores that go into the estimate minus the number of parameters used as intermediate steps in the estimation of the parameter itself (e.g. the sample variance has N − 1 degrees of freedom, since it is computed from N random scores minus the 1 parameter estimated as an intermediate step, which is the sample mean).
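The N − 1 in the sample variance can be seen with Python's statistics module, which divides by N − 1 in variance() (one degree of freedom is spent estimating the mean) and by N in pvariance():

```python
# Sample variance (N - 1 denominator) vs population variance (N denominator).
import statistics

data = [2, 4, 6]  # mean = 4, squared deviations: 4, 0, 4

print(statistics.variance(data))   # 8 / (3 - 1) = 4
print(statistics.pvariance(data))  # 8 / 3
```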

Mathematically, degrees of freedom is the number of dimensions of the domain of a random vector, or essentially the number of "free" components (how many components need to be known before the vector is fully determined). The term is most often used in the context of linear models (linear regression, analysis of variance), where certain random vectors are constrained to lie in linear subspaces, and the number of degrees of freedom is the dimension of the subspace. The degrees of freedom are also commonly associated with the squared lengths (or "sum of squares" of the coordinates) of such vectors, and the parameters of chi-squared and other distributions that arise in associated statistical testing problems. 

While introductory textbooks may introduce degrees of freedom as distribution parameters or through hypothesis testing, it is the underlying geometry that defines degrees of freedom and is critical to a proper understanding of the concept. Walker (1940) has stated this succinctly as "the number of observations minus the number of necessary relations among these observations."

 

b)  Spread of Scores 

Measures of spread describe how similar or varied the set of observed values are for a particular variable (data item). Measures of spread include the range, quartiles, interquartile range, variance, and standard deviation.

When can we measure spread?

The spread of the values can be measured for quantitative data, as the variables are numeric and can be arranged into a logical order with a low-end value and a high-end value.

Why do we measure spread?

Summarizing the dataset can help us understand the data, especially when the dataset is large. As discussed in the Measures of Central Tendency page, the mode, median, and mean summarize the data into a single value that is typical or representative of all the values in the dataset, but this is only part of the 'picture' that summarizes a dataset. Measures of spread summarize the data in a way that shows how scattered the values are and how much they differ from the mean value.

 

c)  Sample

 In statistics and quantitative research methodology, a data sample is a set of data collected and/or selected from a statistical population by a defined procedure.  The elements of a sample are known as sample points, sampling units, or observations. Typically, the population is very large, making a census or a complete enumeration of all the values in the population either impractical or impossible. The sample usually represents a subset of manageable size. Samples are collected and statistics are calculated from the samples so that one can make inferences or extrapolations from the sample to the population. 

The data sample may be drawn from a population without replacement (i.e. no element can be selected more than once in the same sample), in which case it is a subset of the population; or with replacement (i.e. an element may appear multiple times in the one sample), in which case it is a multiset.

 

d)  Confidence Interval 

In statistics, a confidence interval (CI) is a type of interval estimate (of a population parameter) that is computed from the observed data. The confidence level is the frequency (i.e., the proportion) of possible confidence intervals that contain the true value of their corresponding parameter. 

 In other words, if confidence intervals are constructed using a given confidence level in an infinite number of independent experiments, the proportion of those intervals that contain the true value of the parameter will match the confidence level. Confidence intervals consist of a range of values (interval) that act as good estimates of the unknown population parameter. 

 However, the interval computed from a particular sample does not necessarily include the true value of the parameter. Since the observed data are random samples from the true population, the confidence interval obtained from the data is also random. If a corresponding hypothesis test is performed, the confidence level is the complement of the level of significance; for example, a 95% confidence interval reflects a significance level of 0.05. If it is hypothesized that a true parameter value is 0 but the 95% confidence interval does not contain 0, then the estimate is significantly different from zero at the 5% significance level. 

The desired level of confidence is set by the researcher (not determined by data). Most commonly, the 95% confidence level is used. However, other confidence levels can be used, for example, 90% and 99%. Factors affecting the width of the confidence interval include the size of the sample, the confidence level, and the variability in the sample. A larger sample size normally will lead to a better estimate of the population parameter.  Confidence intervals were introduced to statistics by Jerzy Neyman in a paper published in 1937.
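As a sketch, a 95% confidence interval for a mean with a known population standard deviation uses z = 1.96; the sample numbers below are hypothetical.

```python
# Confidence interval for a mean: mean ± z * sigma / sqrt(n).
import math

def confidence_interval(sample_mean, sigma, n, z=1.96):
    """Return (lower, upper) bounds of the interval estimate."""
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: mean 50, population sigma 10, n = 100
low, high = confidence_interval(sample_mean=50, sigma=10, n=100)
print(round(low, 2), round(high, 2))  # 48.04 51.96
```

A larger n shrinks the margin (a better estimate), while a higher confidence level (e.g. z = 2.576 for 99%) widens the interval, matching the factors listed above.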

 

e)  Z Score

Simply put, a z-score is the number of standard deviations a data point lies from the mean. More technically, it is a measure of how many standard deviations below or above the population mean a raw score is. A z-score is also known as a standard score, and it can be placed on a normal distribution curve. Z-scores typically range from −3 standard deviations (which would fall to the far left of the normal distribution curve) up to +3 standard deviations (which would fall to the far right of the normal distribution curve). To use a z-score, you need to know the mean μ and the population standard deviation σ.

Z-scores are a way to compare results from a test to a “normal” population. Results from tests or surveys have thousands of possible results and units. However, those results can often seem meaningless. For example, knowing that someone’s weight is 150 pounds might be good information, but if you want to compare it to the “average” person’s weight, looking at a vast table of data can be overwhelming (especially if some weights are recorded in kilograms). A z-score can tell you where that person’s weight is compared to the average population’s mean weight.
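The weight comparison above can be sketched as a z-score calculation; the population mean and standard deviation below are hypothetical.

```python
# z-score: how many standard deviations a raw score lies from the mean.

def z_score(raw_score, mu, sigma):
    """Return (raw_score - mu) / sigma."""
    return (raw_score - mu) / sigma

# A person weighing 150 pounds, against a hypothetical population
# mean of 140 pounds and standard deviation of 20 pounds:
print(z_score(150, mu=140, sigma=20))  # 0.5
```

A z-score of 0.5 places this weight half a standard deviation above the population mean.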


