R has a wide variety of data types including scalars, vectors (numerical, character, logical), matrices, dataframes, and lists.

Refer to elements of a vector using subscripts.

All columns in a matrix must have the same mode (numeric, character, etc.) and the same length. The general format is

mymatrix <- matrix(vector, nrow=r, ncol=c, byrow=FALSE,
   dimnames=list(char_vector_rownames, char_vector_colnames))

byrow=TRUE indicates that the matrix should be filled by rows. byrow=FALSE indicates that the matrix should be filled by columns (the default). dimnames provides optional labels for the rows and columns. Identify rows, columns or elements using subscripts.

Arrays are similar to matrices but can have more than two dimensions. See help(array) for details.

A dataframe is more general than a matrix, in that different columns can have different modes (numeric, character, factor, etc.). This is similar to SAS and SPSS datasets. There are a variety of ways to identify the elements of a dataframe.

A list is an ordered collection of objects (components). A list allows you to gather a variety of (possibly unrelated) objects under one name. Identify elements of a list using the [[]] convention.

Tell R that a variable is nominal by making it a factor. The factor stores the nominal values as a vector of integers in the range [1...k] (where k is the number of unique values in the nominal variable), and an internal vector of character strings (the original values) mapped to these integers. An ordered factor is used to represent an ordinal variable. R will treat factors as nominal variables and ordered factors as ordinal variables in statistical procedures and graphical analyses. You can use options in the factor( ) and ordered( ) functions to control the mapping of integers to strings (overriding the alphabetical ordering). You can also use factors to create value labels. For more on factors see the UCLA page.

VECTORS
a <- c(1,2,5.3,6,-2,4) # numeric vector
b <- c("one","two","three") # character vector
c <- c(TRUE,TRUE,TRUE,FALSE,TRUE,FALSE) # logical vector

a[c(2,4)] # 2nd and 4th elements of vector
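
Besides positional subscripts, R also accepts negative and logical subscripts. The following is a minimal sketch; the vector v is a hypothetical example and not part of the original code.

```r
# Hypothetical vector used only for this illustration
v <- c(10, 20, 30, 40, 50)

v[2]        # single element by position
v[c(1, 3)]  # several positions at once
v[2:4]      # a range of positions
v[-1]       # a negative subscript drops that position
v[v > 25]   # a logical subscript keeps elements where the test is TRUE
```

Negative and positive subscripts cannot be mixed in the same pair of brackets.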
MATRICES
# general format:
# mymatrix <- matrix(vector, nrow=r, ncol=c, byrow=FALSE,
#    dimnames=list(char_vector_rownames, char_vector_colnames))

# generates 5 x 4 numeric matrix
y <- matrix(1:20, nrow=5, ncol=4)
# another example
cells <- c(1,26,24,68)
rnames <- c("R1", "R2")
cnames <- c("C1", "C2")
mymatrix <- matrix(cells, nrow=2, ncol=2, byrow=TRUE,
   dimnames=list(rnames, cnames))

y[,4] # 4th column of matrix
y[3,] # 3rd row of matrix
y[2:4,1:3] # rows 2,3,4 of columns 1,2,3

ARRAYS
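
An array is subscripted like a matrix, with one subscript per dimension. The following is a minimal sketch; the name arr and its dimensions are hypothetical choices for illustration.

```r
# Hypothetical 2 x 3 x 4 array filled with 1..24
# (values fill the first dimension fastest, as with matrices)
arr <- array(1:24, dim = c(2, 3, 4))

dim(arr)      # the three dimension sizes
arr[1, 2, 3]  # single element: row 1, column 2, layer 3
arr[, , 1]    # the first 2 x 3 layer, returned as a matrix
```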
DATAFRAMES
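
The different ways of identifying dataframe elements can be tried on a small self-contained example. This is only a sketch; the dataframe results and its column names are hypothetical.

```r
# Hypothetical dataframe used only for this illustration
results <- data.frame(ID = 1:4,
                      Color = c("red", "white", "red", NA),
                      Passed = c(TRUE, TRUE, TRUE, FALSE))

results[1:2]                # columns 1 and 2, still a dataframe
results[c("ID", "Passed")]  # columns selected by name
results$Color               # a single column
results[2, "Color"]         # row 2 of column Color
results[results$Passed, ]   # rows where Passed is TRUE
```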
d <- c(1,2,3,4)
e <- c("red", "white", "red", NA)
f <- c(TRUE,TRUE,TRUE,FALSE)
mydata <- data.frame(d,e,f)
names(mydata) <- c("ID","Color","Passed") # variable names

myframe[3:5] # columns 3,4,5 of dataframe
myframe[c("ID","Age")] # columns ID and Age from dataframe
myframe$X1 # variable X1 in the dataframe

LISTS
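
To make the [[]] convention concrete, the sketch below contrasts it with single brackets and with $. The list person and its components are hypothetical and chosen only for illustration.

```r
# Hypothetical list used only for this illustration
person <- list(name = "Fred", scores = c(90, 85, 77), age = 5.3)

person[[2]]         # the 2nd component itself (a numeric vector)
person[["scores"]]  # the same component, selected by name
person$scores       # shorthand for the line above
person[2]           # single brackets return a list of length 1
class(person[[2]])  # class of the component
class(person[2])    # class of the one-element sublist
```

The practical difference is that [[]] extracts the component, while [] returns a smaller list that still wraps it.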
# example of a list with 4 components -
# a string, a numeric vector, a matrix, and a scalar
w <- list(name="Fred", mynumbers=a, mymatrix=y, age=5.3)
# example of a list containing two lists
v <- c(list1,list2)

mylist[[2]] # 2nd component of the list
mylist[["mynumbers"]] # component named mynumbers in list

FACTORS
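
The options that control the mapping of integers to strings can be sketched as follows. The data values here are hypothetical; levels, labels and ordered are standard arguments of factor( ).

```r
# Hypothetical ratings used only for this illustration
rating <- c("small", "large", "medium", "small", "large")

# Default: levels are sorted alphabetically (large, medium, small)
r1 <- ordered(rating)
levels(r1)

# Explicit levels set the mapping: 1=small, 2=medium, 3=large
r2 <- factor(rating, levels = c("small", "medium", "large"), ordered = TRUE)
levels(r2)
as.integer(r2)  # the underlying integer codes

# labels attaches value labels to numerically coded data
sex <- factor(c(1, 2, 2, 1), levels = c(1, 2), labels = c("male", "female"))
table(sex)
```

With the explicit levels, comparisons such as r2[1] < r2[2] follow the intended small < medium < large order rather than the alphabetical one.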
# variable gender with 20 "male" entries and
# 30 "female" entries
gender <- c(rep("male",20), rep("female", 30))
gender <- factor(gender)
# stores gender as 30 1s and 20 2s and associates
# 1=female, 2=male internally (alphabetically)
# R now treats gender as a nominal variable
summary(gender)

# variable rating coded as "large", "medium", "small"
rating <- ordered(rating)
# recodes rating to 1,2,3 and associates
# 1=large, 2=medium, 3=small internally
# R now treats rating as ordinal

Useful Functions
length(object) # number of elements or components
str(object) # structure of an object
class(object) # class or type of an object
names(object) # names
c(object,object,...) # combine objects into a vector
cbind(object, object, ...) # combine objects as columns
rbind(object, object, ...) # combine objects as rows
object # prints the object
ls() # list current objects
rm(object) # delete an object
newobject <- edit(object) # edit copy and save as newobject
fix(object) # edit in place
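
A few of these functions applied to concrete objects, as a minimal sketch. The vectors first and second are hypothetical names used only for illustration.

```r
# Hypothetical objects used only for this illustration
first <- c(1, 2, 3)
second <- c(4, 5, 6)

length(first)         # number of elements
class(first)          # class of the object
c(first, second)      # one vector of six elements
cbind(first, second)  # 3 x 2 matrix, one column per vector
rbind(first, second)  # 2 x 3 matrix, one row per vector
str(list(first = first, second = second))  # compact structure summary
```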
Monday, November 8, 2010
Data Types in R