Data Preprocessing

Today’s real-world databases are highly susceptible to noisy, missing, and inconsistent data due to their typically huge size (often several gigabytes or more) and their likely origin from multiple, heterogeneous sources. Low-quality data will lead to low-quality mining results. “How can the data be preprocessed in order to help improve the quality of the data and, consequently, of the mining results? How can the data be preprocessed so as to improve the efficiency and ease of the mining process?” There are several data preprocessing techniques.

Data cleaning can be applied to remove noise and correct inconsistencies in data. Data integration merges data from multiple sources into a coherent data store such as a data warehouse. Data reduction can reduce data size by, for instance, aggregating, eliminating redundant features, or clustering. Data transformations (e.g., normalization) may be applied, where data are scaled to fall within a smaller range like 0.0 to 1.0. This can improve the accuracy and efficiency of mining algorithms involving distance measurements.
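The scaling described above can be illustrated with a minimal sketch of min-max normalization, one common way to map values into the range 0.0 to 1.0. The text does not say which normalization method is meant, so the choice of min-max scaling is an assumption, and the function name `min_max_normalize` and the sample `incomes` values are hypothetical.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale a list of numbers into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so the range is zero; map everything to new_min.
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


# Hypothetical attribute values, used only for illustration.
incomes = [12000, 35000, 54000, 73600, 98000]
scaled = min_max_normalize(incomes)
print(scaled)
```

After scaling, an attribute with a large raw range (such as income) no longer dominates a distance computation over attributes with small ranges, which is the reason the text gives for the benefit to distance-based mining algorithms.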

These techniques are not mutually exclusive; they may work together.

For example, data cleaning can involve transformations to correct wrong data, such as by transforming all entries for a date field to a common format. In Chapter 2, we learned about the different attribute types and how to use basic statistical descriptions to study data characteristics. These can help identify erroneous values and outliers, which will be useful in the data cleaning and integration steps. Data preprocessing techniques, when applied before mining, can substantially improve the overall quality of the patterns mined and/or the time required for the actual mining.
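The date example can be sketched as follows, assuming the raw entries arrive in a small, known set of formats and that ISO 8601 (`YYYY-MM-DD`) is the chosen common format. The format list `KNOWN_FORMATS`, the helper `to_iso_date`, and the sample entries are all hypothetical; real data would need its own list of formats and a policy for ambiguous entries.

```python
from datetime import datetime

# Assumed input formats; a real data set would need its own list.
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")


def to_iso_date(raw):
    """Return the date as 'YYYY-MM-DD', or None if no known format matches."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # Try the next candidate format.
    return None  # Left for manual review rather than guessed.


# Hypothetical raw entries for a single date field.
raw_dates = ["2018-09-28", "28/09/2018", " 28.09.2018 ", "not a date"]
cleaned = [to_iso_date(d) for d in raw_dates]
print(cleaned)
```

Returning `None` for unparseable entries, instead of guessing, keeps erroneous values visible so they can be handled in a later cleaning step.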

Cite this page

Data Preprocessing. (2018, Sep 28). Retrieved from
