Data Cleansing Strategies on Data Sets Become Data Science

Sardjono Sardjono, R. Yadi Rakhman Alamsyah, Marwondo Marwondo, Elia Setiana

Abstract


The digital era is growing rapidly with the increasing use of smartphones, and many organizations and companies have implemented systems to support their business. This increases the volume of data used and disseminated over both open and closed internet networks. Because large volumes of data must be processed and retrieved from different storage resources, a strategy is required for cleansing the data properly, effectively, and efficiently, so that the data set becomes mature and useful information for business purposes. The R language can process large and complex data loaded from different storage resources. To use R to its full potential, basic skills are needed to process a data set with good data cleansing techniques, so that it can serve as a basis for data science in organizations or companies. This research on data cleansing strategies for data sets owned by organizations describes, step by step, how to obtain data that is useful for data science, so that the data produced by the cleansing process is meaningful and useful for decision making. It also gives beginner data scientists a basic overview of and guide to performing data cleansing in stages, and provides a way to analyze the results of executing the functions used.


Keywords


Data, Data Science, Data Cleansing, Data Set, Data Profiling, R Language, Factor, Step-by-Step, Function, Library, Data Enrichment.


DOI: https://doi.org/10.46336/ijqrm.v1i3.71



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

Published By: 

IJQRM: Jalan Riung Ampuh No. 3, Riung Bandung, Kota Bandung 40295, Jawa Barat, Indonesia

 


 

