
Wednesday, October 11, 2023

Calculating CGPA and Assigning Letter Grades | Educational Assessment and Evaluation

 

QUESTION

Discuss the methods of calculating CGPA and assigning letter grades.
Course: Educational Assessment and Evaluation

Course code: 8602

Level: B.Ed Solved Assignment 

ANSWER  

Calculating CGPA and Assigning Letter Grades

CGPA stands for Cumulative Grade Point Average. It reflects a student's performance in a composite way by combining the grade point averages of all subjects/courses. To calculate CGPA, we need the following information:

• Marks in each subject/course

• Grade point average in each subject/course

• Total credit hours (obtained by adding the credit hours of each subject/course)

Calculating CGPA is simple: the total grade points earned are divided by the total credit hours. For example, if a student in an MA Education program has studied 12 courses of 3 credit hours each, the total credit hours are 36. If the student has earned 108 grade points across all courses, the CGPA is 108/36 = 3.0.
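As a minimal sketch of this calculation (the course grades below are illustrative, not taken from any real transcript), CGPA is the credit-weighted average of grade points:

```python
# Minimal sketch of a CGPA calculation:
# CGPA = sum(grade_point * credit_hours) / sum(credit_hours).

def cgpa(courses):
    """courses: list of (grade_point, credit_hours) tuples."""
    total_points = sum(gp * ch for gp, ch in courses)
    total_credits = sum(ch for _, ch in courses)
    return total_points / total_credits

# 12 courses of 3 credit hours each (36 credit hours in total),
# each earning a grade point of 3.0, gives 108 / 36 = 3.0.
courses = [(3.0, 3)] * 12
print(cgpa(courses))  # 3.0
```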

Assigning letter grades

The letter grade system is the most popular in the world, including in Pakistan. Most teachers face problems while assigning grades. There are four core problems or issues in this regard:

1) What should be included in a letter grade?

2) How should achievement data be combined in assigning letter grades?

3) What frame of reference should be used in grading?

4) How should the distribution of letter grades be determined?

Determining what to include in a grade

Letter grades are likely to be most meaningful and useful when they represent achievement only. If they are combined with other factors such as effort, work completed, personal conduct, and so on, their interpretation becomes hopelessly confused. For example, a letter grade of C may represent average achievement with extraordinary effort and excellent conduct, or vice versa. If letter grades are to be valid indicators of achievement, they must be based on valid measures of achievement. This involves defining objectives as intended learning outcomes and developing or selecting tests and assessments that can measure these learning outcomes.

Combining data in assigning grades

One of the key concerns while assigning grades is to be clear about which aspects of a student are to be assessed and what tentative weightage each learning outcome will carry. For example, if we decide that 35 percent weightage is to be given to mid-term assessments, 40 percent to final-term tests or assessments, and 25 percent to assignments, presentations, classroom participation, and conduct and behavior, we have to combine all elements by assigning the appropriate weight to each, and then use these composite scores as a basis for grading.
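A small sketch of this combination step, using the example weights above (the component names and scores are hypothetical):

```python
# Combining assessment components into a composite score with the
# example weights from the text: 35% mid-term, 40% final, 25% other.
# All scores are assumed to be percentages (0-100).

WEIGHTS = {"midterm": 0.35, "final": 0.40, "other": 0.25}

def composite_score(scores):
    """scores: dict with the same keys as WEIGHTS, values 0-100."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

print(composite_score({"midterm": 80, "final": 70, "other": 90}))  # 78.5
```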

Selecting the proper frame of reference for grading

Letter grades are typically assigned based on one of the following frames of reference:

a) Performance relative to other group members (relative grading)

b) Performance in relation to specified standards (absolute grading)

c) Performance in relation to learning ability (amount of improvement)

 

Assigning grades on a relative basis involves comparing a student’s performance with that of a reference group, mostly class fellows. In this system, the grade is determined by the student’s relative position or ranking in the total group. Although relative grading has the disadvantage of a shifting frame of reference (i.e. grades depend upon the group’s ability), it is still widely used in schools, as most of the time our system of testing is ‘norm-referenced’.

 

Assigning grades on an absolute basis involves comparing a student's performance to specified standards set by the teacher. This is what we call 'criterion-referenced' testing. If all students show a low level of mastery relative to the established performance standard, all will receive low grades. Grading performance in relation to learning ability, by contrast, is inconsistent with a standards-based system of evaluating and reporting student performance, and improvement over a short period is difficult to judge. The resulting lack of reliability in judging achievement relative to ability, and in judging degree of improvement, produces grades of low dependability. Therefore, such grades are best used as supplements to other grading systems.

 

Determining the distribution of grades

 

The assigning of relative grades is essentially a matter of ranking the students in order of overall achievement and assigning letter grades based on each student's rank in the group. This ranking might be limited to a single classroom group or might be based on the combined distribution of several classroom groups taking the same course. If grading on the curve is to be done, the most sensible approach to determining the distribution of letter grades in a school is to have the school staff set general guidelines for introductory and advanced courses.
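The following sketch illustrates one way such a rank-based distribution could be implemented; the grade quotas are invented for illustration, not recommended guidelines:

```python
# Illustrative sketch of relative grading ("grading on the curve"):
# rank students by score and assign letters by position in the group.
# The quota percentages below are assumptions, not prescribed values.

def relative_grades(scores, quotas=(("A", 0.10), ("B", 0.25),
                                    ("C", 0.40), ("D", 0.20), ("F", 0.05))):
    """scores: dict of student -> score; returns dict of student -> grade."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    grades, start = {}, 0
    for letter, share in quotas:
        count = round(share * len(ranked))
        for student in ranked[start:start + count]:
            grades[student] = letter
        start += count
    # Any students left over by rounding get the lowest grade.
    for student in ranked[start:]:
        grades[student] = quotas[-1][0]
    return grades

print(relative_grades({"Ali": 88, "Sara": 73, "Omar": 91, "Hina": 60}))
# {'Omar': 'B', 'Ali': 'C', 'Sara': 'C', 'Hina': 'D'}
```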

All staff members must understand the basis for assigning grades, and this basis must be clearly communicated to users of the grades. If the objectives of a course are clearly stated and the standards for mastery appropriately set, the letter grades in an absolute system may be defined as the degree to which the objectives have been attained, as follows:

A = Outstanding (90 to 100%)

B = Very Good (80 to 89%)

C = Satisfactory (70 to 79%)

D = Very Weak (60 to 69%)

F = Unsatisfactory (less than 60%)
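This absolute scale translates directly into a simple lookup, for example:

```python
# Direct translation of the absolute grading scale given above;
# the percentage bands follow the text (90-100 A ... below 60 F).

def letter_grade(percent):
    if percent >= 90: return "A"  # Outstanding
    if percent >= 80: return "B"  # Very Good
    if percent >= 70: return "C"  # Satisfactory
    if percent >= 60: return "D"  # Very Weak
    return "F"                    # Unsatisfactory

print(letter_grade(78.5))  # C
```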




Tuesday, October 10, 2023

Interpreting Test Scores by Ordering and Ranking | Educational Assessment and Evaluation

QUESTION

How can test scores be interpreted by ordering and ranking?

Course: Educational Assessment and Evaluation

Course code: 8602

Level: B.Ed Solved Assignment 

ANSWER  

Interpreting Test Scores by ordering and ranking

Organizing and reporting students' scores starts with placing the scores in ascending or descending order. From ranked scores, teachers can find the smallest score, the largest score, the range, and other facts such as the variability of the scores. A teacher may use ranked scores to see the relative position of each student within the class, but ranked scores do not by themselves yield a significant numerical value for interpreting or reporting results.
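A short sketch of this ordering-and-ranking step (the raw scores are made up for illustration):

```python
# Ordering and ranking raw scores. Ties share the same rank here
# (competition ranking); other tie-handling conventions also exist.

scores = [62, 85, 73, 85, 58, 91]

ordered = sorted(scores, reverse=True)           # descending order
print("ordered:", ordered)                       # [91, 85, 85, 73, 62, 58]
print("lowest:", min(scores), "highest:", max(scores))
print("range:", max(scores) - min(scores))       # 91 - 58 = 33

# Rank of each score (1 = highest); tied scores get the same rank.
ranks = {s: ordered.index(s) + 1 for s in scores}
print("ranks:", ranks)  # {62: 5, 85: 2, 73: 4, 58: 6, 91: 1}
```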

Measurement Scales

Measurement is the assignment of numbers to objects or events in a systematic fashion. Measurement scales are critical because they determine the types of statistics you can use to analyze your data: an easy way to have a paper rejected is to have used an incorrect scale/statistic combination, or a low-powered statistic on a high-powered set of data. The following four levels of measurement scales are commonly distinguished so that the proper analysis can be applied to the data. At the lowest level, a number can be used merely to label or categorize a response.

Nominal Scale.

Nominal scales are the lowest scales of measurement. A nominal scale, as the name implies, simply places data into categories, without any order or structure. You are only allowed to examine whether a nominal datum is equal to some particular value, or to count the number of occurrences of each value; counting is the only mathematical operation we can perform with nominal data. For example, classmates' blood groups may be categorized into A, B, AB, O, etc. Variables assessed on a nominal scale are called categorical variables; categorical data are measured on nominal scales, which merely assign labels to distinguish categories. For example, gender is a nominal variable, and classifying people according to gender is a common application of a nominal scale.

Nominal Data

 • classification or categorization of data, e.g. male or female

• no ordering, e.g. it makes no sense to state that male is greater than female (M > F), etc

• arbitrary labels, e.g., pass=1 and fail=2, etc

Ordinal Scale.

Something measured on an "ordinal" scale does have an evaluative connotation. You are also allowed to examine whether an ordinal datum is less than or greater than another value; an example is rating job satisfaction on a scale from 1 to 10, with 10 representing complete satisfaction. With ordinal scales, we only know that 2 is better than 1 or 10 is better than 9; we do not know by how much, and the difference may vary. Hence you can 'rank' ordinal data, but you cannot 'quantify' differences between two ordinal values. The ordinal scale includes the properties of the nominal scale.

Ordinal Data

• ordered, but differences between values are not important and may or may not be equal

 • e.g., political parties on left to right spectrum given labels 0, 1, 2

• e.g., Likert scales, rank on a scale of 1..5 your degree of satisfaction

 • e.g., restaurant ratings

 

Interval Scale

When the differences between the values of an ordinal scale become quantifiable, it becomes an interval scale. You are allowed to quantify the difference between two interval-scale values, but there is no natural zero. A variable measured on an interval scale gives more information than an ordinal variable does, because interval variables have an equal distance between adjacent values: the distance between 1 and 2 is equal to the distance between 9 and 10. For example, temperature scales are interval data: 25C is warmer than 20C, and a 5C difference has some physical meaning. Note that 0C is arbitrary, so it does not make sense to say that 20C is twice as hot as 10C, but the difference between 100C and 90C is exactly the same as the difference between 42C and 32C. Students' achievement scores are measured on an interval scale.

Interval Data

• ordered, constant scale, but no natural zero

• differences make sense, but ratios do not (e.g., 30° − 20° = 20° − 10°, but 20° is not twice as hot as 10°)

 • e.g., temperature (C, F), dates

Ratio Scale

Something measured on a ratio scale has the same properties as an interval scale, except that with ratio scaling there is an absolute zero point. Temperature measured in Kelvin is an example: no value below 0 Kelvin is possible; it is absolute zero. Physical measurements of height, weight, and length are typically ratio variables. Weight is another example: 0 lbs. is a meaningful absence of weight. An object that is twice as long as another keeps that two-to-one ratio regardless of which scale it is measured in (e.g., meters or yards); this is because there is a natural zero.

Ratio Data

• ordered, constant scale, natural zero

• e.g., height, weight, age, length

One can think of nominal, ordinal, interval, and ratio as being ranked in their relation to one another. Ratio is more sophisticated than interval, interval is more sophisticated than ordinal, and ordinal is more sophisticated than nominal.
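This hierarchy also determines which summary statistics are meaningful at each level. The sketch below pairs each scale with a defensible statistic; all data values are invented for demonstration:

```python
# Which summary statistic each measurement scale supports.
from statistics import mode, median, mean

blood_groups = ["A", "B", "AB", "O", "A", "O", "A"]  # nominal: count/mode only
satisfaction = [1, 3, 4, 4, 5, 2]                    # ordinal: median/rank
temperatures_c = [20, 25, 30]                        # interval: mean, differences
weights_kg = [50, 60, 70]                            # ratio: means and ratios

print(mode(blood_groups))             # most frequent category: 'A'
print(median(satisfaction))           # 3.5
print(mean(temperatures_c))           # 25 (but 30C is not "1.5x as hot" as 20C)
print(weights_kg[2] / weights_kg[0])  # 1.4 -- ratios are meaningful here
```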




Friday, October 6, 2023

What Are the Considerations While Planning a Test? | Educational Assessment and Evaluation

 

QUESTION

What are the considerations while planning a test?

Course: Educational Assessment and Evaluation

Course code: 8602

Level: B.Ed Solved Assignment 

ANSWER  

Planning a Test

The main objective of classroom assessment is to obtain valid, reliable, and useful data regarding student learning achievement. This requires determining what is to be measured and then defining it precisely so that assessment tasks to measure desired performance can be developed. Classroom tests and assessments can be used for the following instructional objectives:

 i. Pre-testing

Tests and assessments can be given at the beginning of an instructional unit or course to determine:

• whether the students have the prerequisite skills needed for the instruction (readiness, motivation, etc)

• to what extent the students have already achieved the objectives of planned instruction (to determine placement or modification of instruction)

ii. Testing During Instruction

 • provides bases for formative assessment

• monitor learning progress

 • detect learning errors

 • provide feedback for students and teachers

 iii. End of Instruction Testing

 • measure intended learning outcomes

• used for summative assessment

 • provides bases for grades, promotion, etc

 Prior to developing an effective test, one needs to determine whether or not a test is the appropriate type of assessment. If the learning objectives are primarily types of procedural knowledge (how to perform a task) then a written test may not be the best approach. Assessment of procedural knowledge generally calls for a performance demonstration assessed using a rubric. Where a demonstration of a procedure is not appropriate, a test can be an effective assessment tool.

The first stage of developing a test is planning the test content and length. Planning the test begins with the development of a blueprint or test specifications for the test structured on the learning outcomes or instructional objectives to be assessed by the test instrument. For each learning outcome, a weight should be assigned based on the relative importance of that outcome in the test. The weight will be used to determine the number of items related to each of the learning outcomes.
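As an illustration of turning outcome weights into item counts for such a blueprint (the outcomes and weights here are hypothetical, not prescribed by the text):

```python
# Sketch of allocating test items to learning outcomes by weight.

def blueprint(weights, total_items):
    """weights: dict of learning outcome -> relative weight (sums to 1)."""
    return {outcome: round(w * total_items) for outcome, w in weights.items()}

weights = {"knowledge": 0.30, "comprehension": 0.30,
           "application": 0.25, "analysis": 0.15}
print(blueprint(weights, total_items=40))
# {'knowledge': 12, 'comprehension': 12, 'application': 10, 'analysis': 6}
```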

 

Test Specifications

When an engineer prepares a design to construct a building and chooses the materials he intends to use in construction, he usually knows what the building is going to be used for, and therefore designs it to meet the requirements of its planned inhabitants. Similarly, in testing, the table of specifications is the blueprint of the assessment, which specifies the percentages and weightage of test items and the constructs being measured. It includes the constructs and concepts to be measured, the tentative weightage of each construct, a specific number of items for each concept, and a description of the item types to be constructed. It is not surprising that specifications are also referred to as 'blueprints', for they are literally architectural drawings for test construction. Fulcher & Davidson (2009) divided test specifications into the following four elements:

 

Item specifications:

 Item specifications describe the items, prompts or tasks, and any other material such as texts, diagrams, and charts that are used as stimuli. Typically, a specification at this sub-level contains two key elements: samples of the tasks to be produced, and guiding language that details all information necessary to produce the task.

Presentation Model:

The presentation model provides information on how the items and tasks are presented to the test takers.

Assembly Model: 

The assembly model helps the test developer combine test items and tasks to develop a test format.

Delivery Model:

The delivery model tells how the actual test is delivered. It includes information regarding test administration, test security/confidentiality, and time constraints.

 

Wednesday, October 4, 2023

Content Validity | Construct Validity | Educational Assessment and Evaluation

 

QUESTION

Write a note on content validity and construct validity.

Course: Educational Assessment and Evaluation

Course code: 8602

Level: B.Ed Solved Assignment 

ANSWER

 

Content Validity

Evidence of content validity comes from a judgmental process, which may be formal or informal. The formal process follows a systematic procedure to arrive at a judgment; its important components are the identification of behavioural objectives and the construction of a table of specifications. Content validity evidence involves the degree to which the content of the test matches the content domain associated with the construct. For example, a test of the ability to add two numbers should include a range of combinations of digits.

 A test with only one-digit numbers, or only even numbers, would not have good coverage of the content domain. Content-related evidence typically involves Subject Matter Experts (SMEs) evaluating test items against the test specifications. It is a non-statistical type of validity that involves “the systematic examination of the test content to determine whether it covers a representative sample of the behaviour domain to be measured” (Anastasi & Urbina, 1997). For example, does an IQ questionnaire have items covering all areas of intelligence discussed in the scientific literature? A test has content validity built into it by careful selection of which items to include (Anastasi & Urbina, 1997). Items are chosen so that they comply with the test specification which is drawn up through a thorough examination of the subject domain. Foxcraft et al. (2004, p. 49) note that by using a panel of experts to review the test specifications and the selection of items the content validity of a test can be improved. 

The experts will be able to review the items and comment on whether the items cover a representative sample of the behaviour domain.

For Example –

In developing a teaching competency test, experts in the field of teacher training would identify the information and issues required to be an effective teacher and then choose (or rate) items that represent those areas of information and skills which are expected from a teacher to exhibit in the classroom.

Lawshe (1975) proposed that, to judge content validity, each rater should answer the following question for each item: Is the skill or knowledge measured by this item:

• Essential

• Useful but not essential

• Not necessary

Concerning educational achievement tests, a test is considered content valid when the proportion of the material covered in the test approximates the proportion of material covered in the course.
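Although the text does not spell out the formula, Lawshe's (1975) ratings are commonly summarized per item as a content validity ratio, CVR = (n_e − N/2) / (N/2), where n_e is the number of experts rating the item "essential" and N is the panel size. A minimal sketch (the ratings are invented):

```python
# Lawshe's (1975) content validity ratio for a single item:
# CVR = (n_e - N/2) / (N/2).

def content_validity_ratio(ratings):
    """ratings: list of 'essential' / 'useful' / 'not necessary' labels."""
    n = len(ratings)
    n_e = ratings.count("essential")
    return (n_e - n / 2) / (n / 2)

ratings = ["essential"] * 8 + ["useful"] * 2  # 8 of 10 experts say essential
print(content_validity_ratio(ratings))       # (8 - 5) / 5 = 0.6
```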

 

Construct Validity

Before defining construct validity, it seems necessary to elaborate on the concept of a construct: it is the concept or characteristic that a test is designed to measure. A construct provides the target that a particular assessment or set of assessments is designed to measure; it is a separate entity from the test itself. According to Howell (1992), construct validity is a test's ability to measure factors that are relevant to the field of study. Construct validity is thus an assessment of the quality of an instrument or experimental design; it asks, 'Does it measure the construct it is supposed to measure?' Construct validity is rarely applied in achievement tests.

Construct validity refers to the extent to which operationalizations of a construct (e.g. practical tests developed from a theory) do actually measure what the theory says they do. For example, to what extent is an IQ questionnaire actually measuring "intelligence"? Construct validity evidence involves the empirical and theoretical support for the interpretation of the construct. Such lines of evidence include statistical analyses of the internal structure of the test including the relationships between responses to different test items. They also include relationships between the test and measures of other constructs. As currently understood, construct validity is not distinct from the support for the substantive theory of the construct that the test is designed to measure. As such, experiments designed to reveal aspects of the causal role of the construct also contribute to construct validity evidence. Construct validity occurs when the theoretical constructs of cause and effect accurately represent the real-world situations they are intended to model. This is related to how well the experiment is operationalized. A good experiment turns the theory (constructs) into actual things you can measure. Sometimes just finding out more about the construct (which itself must be valid) can be helpful.

Construct validity addresses the construct that is mapped onto the test items; it is also assured either by the judgmental method or by developing the test specification before the development of the test. Constructs have some essential properties, two of which are listed below:

1. They are abstract summaries of some regularity in nature.

2. They are related to concrete, observable entities.

For example, integrity is a construct; it cannot be directly observed, yet it is useful for understanding, describing, and predicting human behaviour.




Monday, October 2, 2023

Extended Response Essay Type Items | Educational Assessment and Evaluation

 QUESTION

Write a detailed note on extended response essay-type items.

Course: Educational Assessment and Evaluation

Course code: 8602

Level: B.Ed Solved Assignment 

ANSWER

Extended Response Essay Type Items

 An essay-type item that allows the student to determine the length and complexity of the response is called an extended-response essay item. This type of essay is most useful at the synthesis or evaluation levels of the cognitive domain. We are interested in determining whether students can organize, integrate, express, and evaluate information, ideas, or pieces of knowledge when the extended response items are used.

Example:

Identify as many different ways to generate electricity in Pakistan as you can. Give the advantages and disadvantages of each. Your response will be graded on its accuracy, comprehensiveness, and practicality. Your response should be 8-10 pages in length, and it will be evaluated according to the RUBRIC (scoring criteria) already provided.

Overall, essay-type items (both restricted response and extended response) are:

Good for:

• Application, synthesis, and evaluation levels

Types:

• Extended response: synthesis and evaluation levels; a lot of freedom in answers

• Restricted response: more consistent scoring; outlines the parameters of responses

 Advantages:

• Students less likely to guess

 • Easy to construct

 • Stimulates more study

• Allows students to demonstrate an ability to organize knowledge, express opinions, and show originality

Disadvantages:

• Can limit the amount of material tested, therefore has decreased validity.

 • Subjective, potentially unreliable scoring.

• Time-consuming to score.

Tips for Writing Good Essay Items:

• Provide reasonable time limits for thinking and writing.

• Avoid letting students answer a choice of questions (you won't get a good picture of the breadth of student achievement when each student answers a different subset of questions)

• Give definitive tasks to compare, analyze, evaluate, etc.

• Use a checklist point system to score against a model answer: write an outline and determine how many points to assign to each part (see the sketch after this list)

• Score one question at a time across all students' papers before moving on to the next question
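A minimal sketch of such a checklist point system (the answer parts and point values are hypothetical, loosely based on the electricity-generation example above):

```python
# Checklist point system: a model answer is broken into parts, each
# worth set points, and a response is scored against that checklist.

MODEL_ANSWER = {            # part of the model answer -> points assigned
    "names 3+ generation methods": 4,
    "advantages for each method": 3,
    "disadvantages for each method": 3,
}

def score_essay(checklist):
    """checklist: dict of part -> True/False (was the part covered?)."""
    return sum(points for part, points in MODEL_ANSWER.items()
               if checklist[part])

print(score_essay({"names 3+ generation methods": True,
                   "advantages for each method": True,
                   "disadvantages for each method": False}))  # 7 of 10
```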

