Data Manipulation
Data manipulation involves getting large data sets into the form required for statistical analysis. In a large data set, much of the data may be unrelated to what you are trying to accomplish. Data manipulation can be quite complex, but it is very important for achieving the goals of the analysis.
Data manipulation covers a wide variety of tasks, such as:
reading data from text files, spreadsheets, databases and other sources and importing it into an appropriate statistical package
manipulating date/time and character data
aggregating data and reshaping data
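The tasks above can be sketched on a small scale as follows. This is only an illustration, written in Python with the standard library; the same steps would normally be carried out in a statistical package such as R or SAS. The data, the column names (site, date, species, count) and the helper functions load_rows, monthly_totals and to_wide are all made up for this example.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical raw data standing in for a text file (values are invented).
RAW = """site,date,species,count
 north ,2021-03-01,Mallard,4
north,2021-03-15,mallard,6
South,2021-03-02,Teal,3
south,2021-04-10,teal,5
"""


def load_rows(text):
    """Read CSV text into a list of dicts with cleaned, typed fields."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append({
            # Character manipulation: trim whitespace, normalize case.
            "site": rec["site"].strip().lower(),
            "species": rec["species"].strip().lower(),
            # Date manipulation: parse the text into a date object.
            "date": datetime.strptime(rec["date"], "%Y-%m-%d").date(),
            "count": int(rec["count"]),
        })
    return rows


def monthly_totals(rows):
    """Aggregate counts by (site, year-month)."""
    totals = defaultdict(int)
    for r in rows:
        totals[(r["site"], r["date"].strftime("%Y-%m"))] += r["count"]
    return dict(totals)


def to_wide(totals):
    """Reshape long (site, month) totals into one row per site, one column per month."""
    months = sorted({month for _, month in totals})
    wide = {}
    for (site, month), total in totals.items():
        # Months with no observations for a site are filled with 0.
        wide.setdefault(site, {m: 0 for m in months})[month] = total
    return wide


rows = load_rows(RAW)
totals = monthly_totals(rows)
wide = to_wide(totals)

for site in sorted(wide):
    print(site, wide[site])
```

Note that the inconsistent spellings of the site names (" north ", "South") have to be cleaned before aggregating; otherwise they would be counted as separate sites.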
SOME RECOMMENDED RESOURCES
Wickham, H. and Grolemund, G. 2017. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly, Sebastopol, CA.
Spector, P. 2008. Data Manipulation with R. Springer, New York.
Nolan, D. and Temple Lang, D. 2014. XML and Web Technologies for Data Sciences with R. Springer, New York.
Cody, R. 2008. Cody’s Data Cleaning Techniques Using SAS, 2nd Edition. SAS Institute.
DataCamp courses:
Introduction to the Tidyverse
Import Data into R (Parts 1 and 2)
Cleaning Data in R
Importing and Cleaning Data in R: Case Studies
Data Manipulation in R using dplyr
Joining Data in R with dplyr