Friday, June 11, 2021

What is data cleaning? Write down its importance and benefits. How to ensure it before analysis of data? | Introduction to Educational Statistics | AIOU Solved Assignment | Course Code 8614

 

Q.5 What is data cleaning? Write down its importance and benefits. How to ensure it before the analysis of data? 

Course:  Introduction to Educational Statistics 

Course Code 8614

Topics 

  • What is Data Cleaning?
  • Importance of Data cleaning
  • Benefits of Data cleaning
  • Data Cleansing for a Cleaner Database

AIOU Solved Assignment | Semester: Autumn/Spring | B.Ed / M.Ed / M.Phil / PhD in Education | Course Code 8614 | Course: Introduction to Educational Statistics

Answer:

 

Data cleansing, or data cleaning, is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database. It involves identifying incomplete, incorrect, inaccurate, or irrelevant parts of the data and then replacing, modifying, or deleting this dirty or coarse data. Data cleansing may be performed interactively with data wrangling tools or as batch processing through scripting.

After cleansing, a data set should be consistent with other similar data sets in the system. The inconsistencies detected or removed may have been caused originally by user entry errors, by corruption in transmission or storage, or by different data dictionary definitions of similar entities in different stores. Data cleaning differs from data validation in that validation is performed at the time of entry and almost invariably means that invalid data is rejected from the system at that point, whereas cleaning is applied to batches of data that are already stored.

The actual process of data cleansing may involve removing typographical errors or validating and correcting values against a known list of entities. The validation may be strict (such as rejecting any address that does not have a valid postal code) or fuzzy (such as correcting records that partially match existing, known records). Some data cleansing solutions clean data by cross-checking it against a validated data set. A common data cleansing practice is data enhancement, where data is made more complete by adding related information, for example, appending to each address any phone numbers related to that address.
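The difference between strict and fuzzy validation can be sketched in a few lines of Python. This is only an illustration, not a prescribed method: the list `KNOWN_CITIES`, the function names, and the similarity cutoff of 0.8 are assumptions made for the example. The fuzzy step uses the standard-library function `difflib.get_close_matches`.

```python
import difflib

# Hypothetical reference list of known, valid entries (an assumption for this example).
KNOWN_CITIES = ["Islamabad", "Lahore", "Karachi", "Peshawar"]


def strict_validate(value, known=KNOWN_CITIES):
    """Strict check: accept the value only if it exactly matches a known entry."""
    return value in known


def fuzzy_correct(value, known=KNOWN_CITIES, cutoff=0.8):
    """Fuzzy check: return the closest known entry, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(value, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for raw in ["Lahore", "Lahor", "Xyz"]:
        print(raw, strict_validate(raw), fuzzy_correct(raw))
```

A strict check rejects the misspelled "Lahor" outright, while the fuzzy check proposes a correction. In practice a fuzzy correction would usually be reviewed before it overwrites the original value.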

Data cleansing may also involve activities such as harmonization and standardization of data. Harmonization means, for example, expanding short codes (st, rd, etc.) to the actual words (street, road, and so on). Standardization means converting a reference data set to a new standard, for example, the use of standard codes.
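Harmonization of short codes could look roughly like the following sketch. The mapping `ABBREVIATIONS` and the function `harmonize_address` are hypothetical names chosen for illustration, and the mapping is deliberately tiny. Note that a real rule set would need more care, since "St" can also stand for "Saint".

```python
import re

# Hypothetical mapping of address short codes to full words (an assumption for this example).
ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue"}

# Match a short code as a whole word, optionally followed by a period.
_PATTERN = re.compile(r"\b(" + "|".join(ABBREVIATIONS) + r")\b\.?", re.IGNORECASE)


def harmonize_address(address):
    """Expand known short codes to full words, keeping the original capitalization."""

    def expand(match):
        word = match.group(1)
        full = ABBREVIATIONS[word.lower()]
        return full.capitalize() if word[0].isupper() else full

    return _PATTERN.sub(expand, address)


if __name__ == "__main__":
    print(harmonize_address("12 Main St."))
    print(harmonize_address("45 park rd"))
```

Because the pattern only matches whole words, names such as "Stone" or ordinals such as "1st" are left untouched.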

Data cleansing is a valuable process that can help companies save time and increase their efficiency. Various organizations use data cleansing software tools to remove duplicate data and to correct badly formatted, incorrect, and incomplete data in marketing lists, databases, and CRMs. These tools can achieve in a short period what could take an administrator days or weeks to fix manually. This means that companies can save not only time but also money by acquiring data-cleaning tools.
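As a rough idea of what duplicate removal involves, the sketch below keeps the first occurrence of each record in a list. The record layout (dictionaries with `name` and `email` keys) and the choice to compare records after trimming spaces and ignoring letter case are assumptions made for this example; commercial tools apply far more elaborate matching rules.

```python
def normalize(record):
    """Build a comparison key from the trimmed, lower-cased name and e-mail."""
    return (record["name"].strip().lower(), record["email"].strip().lower())


def remove_duplicates(records):
    """Keep the first occurrence of each record, judged by its normalized key."""
    seen = set()
    unique = []
    for record in records:
        key = normalize(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


if __name__ == "__main__":
    # Made-up sample records for illustration only.
    contacts = [
        {"name": "Ali Khan", "email": "ali@example.com"},
        {"name": " ali khan ", "email": "ALI@example.com"},
        {"name": "Sara Ahmed", "email": "sara@example.com"},
    ]
    for row in remove_duplicates(contacts):
        print(row)
```

The second record differs from the first only in spacing and letter case, so it is treated as a duplicate and dropped.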

Data cleansing is of particular value to organizations that have vast amounts of data to deal with. These can include banks and government organizations, but small to medium enterprises can also find a good use for the programs. In fact, many sources suggest that any firm that works with and holds data should invest in cleansing tools. The tools should also be used regularly, as the amount of inaccurate data can grow quickly, compromising the database and decreasing business efficiency.

 

Data Cleansing for a Cleaner Database 

Companies may also find that cleansing enables them to remain compliant with the standards that are legally expected of them. In most territories, companies are duty-bound to ensure that their data is as accurate and current as possible. The tools can be used for everything from correcting spelling mistakes to fixing postcodes, as well as removing unnecessary records from systems. This means that storage space can be preserved, and that information that is no longer needed, or that companies are no longer permitted to keep, can be removed simply, quickly, and efficiently.

Users of data cleansing software can set their own rules to increase the efficiency of a database, making the capabilities of the software as applicable to the company's needs and requirements as possible. Common problems with databases include incorrectly formatted phone numbers and e-mail addresses, which render clients and customers uncontactable.

The software can be used to put things right in a matter of seconds. This makes it a perfect tool for companies that need to stay in touch with outside parties. Meanwhile, companies that employ more than one database (companies that are spread across various branches or offices, for example) can use the tools to ensure that every branch of the organization shares the same accurate information.
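The formatting problems mentioned above (phone numbers and e-mail addresses) are the kind of thing a user-defined rule can catch. The following is a minimal sketch under stated assumptions: the expected phone length of 11 digits is a hypothetical local rule, and the e-mail pattern is deliberately simple and much looser than the real addressing rules.

```python
import re

# Deliberately simple pattern: something@something.tld (an assumption, not a full e-mail check).
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def clean_phone(raw, expected_digits=11):
    """Strip everything except digits; return None if the length is not the expected one."""
    digits = re.sub(r"\D", "", raw)
    return digits if len(digits) == expected_digits else None


def is_plausible_email(raw):
    """Return True if the text looks roughly like an e-mail address."""
    return bool(EMAIL_PATTERN.match(raw.strip()))


if __name__ == "__main__":
    print(clean_phone("0300-123 4567"))
    print(clean_phone("12345"))
    print(is_plausible_email("user@example.com"))
    print(is_plausible_email("userexample.com"))
```

Records for which `clean_phone` returns `None` or `is_plausible_email` returns `False` would be flagged for correction rather than silently deleted.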

