Learning swirl, Part 7: Matrices and Data Frames

Posted on 2015-05-07 | Category: Tech Sharing


| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

 1: Basic Building Blocks      2: Workspace and Files     
 3: Sequences of Numbers       4: Vectors                 
 5: Missing Values             6: Subsetting Vectors      
 7: Matrices and Data Frames   8: Logic                   
 9: Functions                 10: lapply and sapply       
11: vapply and tapply         12: Looking at Data         
13: Simulation                14: Dates and Times         
15: Base Graphics             

Selection: 7

  |                                                            |   0%

| In this lesson, we'll cover matrices and data frames. Both
| represent 'rectangular' data types, meaning that they are used to
| store tabular data, with rows and columns.

...

  |==                                                          |   3%

| The main difference, as you'll see, is that matrices can only
| contain a single class of data, while data frames can consist of
| many different classes of data.

...

  |===                                                         |   6%

| Let's create a vector containing the numbers 1 through 20 using the
| `:` operator. Store the result in a variable called my_vector.

> my_vector<-1:20

| That's the answer I was looking for.

  |=====                                                       |   9%

| View the contents of the vector you just created.

> my_vector
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

| Excellent job!

  |=======                                                     |  11%

| The dim() function tells us the 'dimensions' of an object. What
| happens if we do dim(my_vector)? Give it a try.

> dim(my_vector)
NULL

| That's correct!

  |=========                                                   |  14%

| Clearly, that's not very helpful! Since my_vector is a vector, it
| doesn't have a `dim` attribute (so it's just NULL), but we can find
| its length using the length() function. Try that now.

> length(my_vector)
[1] 20

| Excellent job!

  |==========                                                  |  17%

| Ah! That's what we wanted. But, what happens if we give my_vector a
| `dim` attribute? Let's give it a try. Type dim(my_vector) <- c(4,
| 5).

> dim(my_vector)<-c(4,5)

| Great job!

  |============                                                |  20%

| It's okay if that last command seemed a little strange to you. It
| should! The dim() function allows you to get OR set the `dim`
| attribute for an R object. In this case, we assigned the value c(4,
| 5) to the `dim` attribute of my_vector.

...

  |==============                                              |  23%

| Use dim(my_vector) to confirm that we've set the `dim` attribute
| correctly.

> dim(my_vector)
[1] 4 5

| All that practice is paying off!

  |===============                                             |  26%

| Another way to see this is by calling the attributes() function on
| my_vector. Try it now.

> attributes(my_vector)
$dim
[1] 4 5


| Keep up the great work!

  |=================                                           |  29%

| Just like in math class, when dealing with a 2-dimensional object
| (think rectangular table), the first number is the number of rows
| and the second is the number of columns. Therefore, we just gave
| my_vector 4 rows and 5 columns.

...

  |===================                                         |  31%

| But, wait! That doesn't sound like a vector any more. Well, it's
| not. Now it's a matrix. View the contents of my_vector now to see
| what it looks like.

> my_vector
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    9   13   17
[2,]    2    6   10   14   18
[3,]    3    7   11   15   19
[4,]    4    8   12   16   20

| All that practice is paying off!
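
Side note: the printout above shows that R fills a matrix column by column. The sketch below is a minimal illustration of that layout; the names `v`, `i`, and `j` are placeholders chosen here, not part of the lesson.

```r
# A matrix is an atomic vector plus a dim attribute.
v <- 1:20
dim(v) <- c(4, 5)          # 4 rows, 5 columns, filled column by column

# In a column-major matrix with 4 rows, element [i, j] sits at
# position (j - 1) * 4 + i of the underlying vector.
i <- 3
j <- 2
v[i, j]                    # 7
v[(j - 1) * 4 + i]         # 7, the same element via plain vector indexing
```

Indexing with a single number still works on a matrix because the underlying data is still one long vector.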

  |=====================                                       |  34%

| Now, let's confirm it's actually a matrix by using the class()
| function. Type class(my_vector) to see what I mean.

> class(my_vector)
[1] "matrix"

| You're the best!
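
Side note: this transcript was recorded in 2015. From R 4.0.0 onward, `class()` on a matrix returns `"matrix" "array"` rather than just `"matrix"`, so a check written as `class(m) == "matrix"` no longer behaves as a single TRUE/FALSE. A version-independent check might look like the sketch below, where `m` is just an illustrative name.

```r
m <- matrix(1:20, nrow = 4, ncol = 5)

# Both checks hold on old and new R versions alike.
is.matrix(m)            # TRUE
inherits(m, "matrix")   # TRUE
```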

  |======================                                      |  37%

| Sure enough, my_vector is now a matrix. We should store it in a new
| variable that helps us remember what it is. Store the value of
| my_vector in a new variable called my_matrix.

> my_matrix<-my_vector

| Keep up the great work!

  |========================                                    |  40%

| The example that we've used so far was meant to illustrate the
| point that a matrix is simply an atomic vector with a dimension
| attribute. A more direct method of creating the same matrix uses
| the matrix() function.

...

  |==========================                                  |  43%

| Bring up the help file for the matrix() function now using the `?`
| function.

> ?matrix

| Perseverance, that's the answer.

  |===========================                                 |  46%

| Now, look at the documentation for the matrix function and see if
| you can figure out how to create a matrix containing the same
| numbers (1-20) and dimensions (4 rows, 5 columns) by calling the
| matrix() function. Store the result in a variable called
| my_matrix2.

> my_matrix2<-matrix(1:20,nrow=4,ncol=5)

| That's a job well done!

  |=============================                               |  49%

| Finally, let's confirm that my_matrix and my_matrix2 are actually
| identical. The identical() function will tell us if its first two
| arguments are the same. Try it out.

> identical(my_matrix,my_matrix2)
[1] TRUE

| You nailed it! Good job!
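
Side note: `matrix()` fills by column by default, which is why it reproduces the `dim()` result exactly. If the numbers should run along the rows instead, the `byrow` argument changes the fill order. A small sketch, with `by_col` and `by_row` as illustrative names:

```r
by_col <- matrix(1:20, nrow = 4, ncol = 5)                 # default: column by column
by_row <- matrix(1:20, nrow = 4, ncol = 5, byrow = TRUE)   # fill row by row

by_col[1, ]                  # 1 5 9 13 17
by_row[1, ]                  # 1 2 3 4 5
identical(by_col, by_row)    # FALSE: same numbers, different layout
```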

  |===============================                             |  51%

| Now, imagine that the numbers in our table represent some
| measurements from a clinical experiment, where each row represents
| one patient and each column represents one variable for which
| measurements were taken.

...

  |=================================                           |  54%

| We may want to label the rows, so that we know which numbers belong
| to each patient in the experiment. One way to do this is to add a
| column to the matrix, which contains the names of all four people.

...

  |==================================                          |  57%

| Let's start by creating a character vector containing the names of
| our patients -- Bill, Gina, Kelly, and Sean. Remember that double
| quotes tell R that something is a character string. Store the
| result in a variable called patients.

> patients<-c("Bill","Gina","Kelly","Sean")

| You are doing so well!

  |====================================                        |  60%

| Now we'll use the cbind() function to 'combine columns'. Don't
| worry about storing the result in a new variable. Just call cbind()
| with two arguments -- the patients vector and my_matrix.

> cbind(patients,my_matrix)
     patients                       
[1,] "Bill"   "1" "5" "9"  "13" "17"
[2,] "Gina"   "2" "6" "10" "14" "18"
[3,] "Kelly"  "3" "7" "11" "15" "19"
[4,] "Sean"   "4" "8" "12" "16" "20"

| That's correct!

  |======================================                      |  63%

| Something is fishy about our result! It appears that combining the
| character vector with our matrix of numbers caused everything to be
| enclosed in double quotes. This means we're left with a matrix of
| character strings, which is no good.

...

  |=======================================                     |  66%

| If you remember back to the beginning of this lesson, I told you
| that matrices can only contain ONE class of data. Therefore, when
| we tried to combine a character vector with a numeric matrix, R was
| forced to 'coerce' the numbers to characters, hence the double
| quotes.

...

  |=========================================                   |  69%

| This is called 'implicit coercion', because we didn't ask for it.
| It just happened. But why didn't R just convert the names of our
| patients to numbers? I'll let you ponder that question on your own.

...
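
Side note on the question swirl leaves open: one way to explore it is to try both directions of conversion. Every number has a text representation, but a name such as "Bill" has no numeric one, so converting toward character loses nothing. The sketch below uses the illustrative names `mixed` and `nums`.

```r
# Combining a number and a string coerces everything to character.
mixed <- c(1, "Bill")
class(mixed)               # "character"

# Going the other way loses information: "Bill" cannot become a number.
# as.numeric() would warn here, so the warning is suppressed for the demo.
nums <- suppressWarnings(as.numeric(c("1", "Bill")))
nums                       # 1 NA
```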

  |===========================================                 |  71%

| So, we're still left with the question of how to include the names
| of our patients in the table without destroying the integrity of
| our numeric data. Try the following -- my_data <-
| data.frame(patients, my_matrix)

> my_data<-data.frame(patients,my_matrix)

| You nailed it! Good job!

  |=============================================               |  74%

| Now view the contents of my_data to see what we've come up with.

> my_data
  patients X1 X2 X3 X4 X5
1     Bill  1  5  9 13 17
2     Gina  2  6 10 14 18
3    Kelly  3  7 11 15 19
4     Sean  4  8 12 16 20

| Your dedication is inspiring!

  |==============================================              |  77%

| It looks like the data.frame() function allowed us to store our
| character vector of names right alongside our matrix of numbers.
| That's exactly what we were hoping for!

...

  |================================================            |  80%

| Behind the scenes, the data.frame() function takes any number of
| arguments and returns a single object of class `data.frame` that is
| composed of the original objects.

...

  |==================================================          |  83%

| Let's confirm this by calling the class() function on our newly
| created data frame.

> class(my_data)
[1] "data.frame"

| That's the answer I was looking for.
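
Side note: to check that each column really kept its own class, one option is to apply `class()` to every column. This is a sketch, not part of the lesson; `df` and `col_classes` are illustrative names. `stringsAsFactors = FALSE` is passed explicitly because its default changed in R 4.0.0; in the older R used for this transcript, the names column would otherwise have been stored as a factor.

```r
patients <- c("Bill", "Gina", "Kelly", "Sean")
df <- data.frame(patients, matrix(1:20, nrow = 4), stringsAsFactors = FALSE)

# One class per column: character for the names, integer for the numbers.
col_classes <- sapply(df, class)
col_classes

# str() gives a compact overview of the same information.
str(df)
```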

  |===================================================         |  86%

| It's also possible to assign names to the individual rows and
| columns of a data frame, which presents another possible way of
| determining which row of values in our table belongs to each
| patient.

...

  |=====================================================       |  89%

| However, since we've already solved that problem, let's solve a
| different problem by assigning names to the columns of our data
| frame so that we know what type of measurement each column
| represents.

...

  |=======================================================     |  91%

| Since we have six columns (including patient names), we'll need to
| first create a vector containing one element for each column.
| Create a character vector called cnames that contains the following
| values (in order) -- "patient", "age", "weight", "bp", "rating",
| "test".

> cnames<-c("patient","age","weight","bp","rating","test")

| Nice work!

  |=========================================================   |  94%

| Now, use the colnames() function to set the `colnames` attribute
| for our data frame. This is similar to the way we used the dim()
| function earlier in this lesson.

> colnames(my_data)<-cnames

| You are amazing!

  |==========================================================  |  97%

| Let's see if that got the job done. Print the contents of my_data.

> my_data
  patient age weight bp rating test
1    Bill   1      5  9     13   17
2    Gina   2      6 10     14   18
3   Kelly   3      7 11     15   19
4    Sean   4      8 12     16   20

| You nailed it! Good job!
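
Side note: once the columns have names, they can be pulled out by name. The sketch below uses a smaller two-patient data frame (`df` is an illustrative name) to keep the output short.

```r
df <- data.frame(c("Bill", "Gina"), matrix(1:10, nrow = 2))
colnames(df) <- c("patient", "age", "weight", "bp", "rating", "test")

# For a data frame, names() and colnames() return the same thing.
identical(names(df), colnames(df))   # TRUE

# Two common ways to extract a single column by name.
df$age                               # 1 2
df[["weight"]]                       # 3 4
```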

  |============================================================| 100%

| In this lesson, you learned the basics of working with two very
| important and common data structures -- matrices and data frames.
| There's much more to learn and we'll be covering more advanced
| topics, particularly with respect to data frames, in future
| lessons.

...

| Are you currently enrolled in the Coursera course associated with
| this lesson?

1: Yes
2: No

Selection: 2

| You've reached the end of this lesson! Returning to the main
| menu...

Learning swirl, Part 6: Subsetting Vectors

Posted on 2015-05-07 | Category: Tech Sharing


| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

 1: Basic Building Blocks      2: Workspace and Files     
 3: Sequences of Numbers       4: Vectors                 
 5: Missing Values             6: Subsetting Vectors      
 7: Matrices and Data Frames   8: Logic                   
 9: Functions                 10: lapply and sapply       
11: vapply and tapply         12: Looking at Data         
13: Simulation                14: Dates and Times         
15: Base Graphics             

Selection: 6

  |                                                        |   0%

| In this lesson, we'll see how to extract elements from a vector
| based on some conditions that we specify.

...

  |=                                                       |   3%

| For example, we may only be interested in the first 20 elements
| of a vector, or only the elements that are not NA, or only
| those that are positive or correspond to a specific variable of
| interest. By the end of this lesson, you'll know how to handle
| each of these scenarios.

...

  |===                                                     |   5%

| I've created for you a vector called x that contains a random
| ordering of 20 numbers (from a standard normal distribution)
| and 20 NAs. Type x now to see what it looks like.

> x
 [1]          NA  1.01612351  0.17390520          NA -0.62466706
 [6]          NA -2.57269671          NA -0.44002462          NA
[11]  0.37101633  0.65818630  1.03885003  0.16175551          NA
[16] -0.32999611          NA          NA          NA  0.40024254
[21]          NA  0.53018587          NA          NA          NA
[26]          NA          NA  0.28211580 -0.04009442          NA
[31]  0.79493463  0.60598426          NA -1.42021598          NA
[36]  0.17550349  0.39153186          NA  1.07989501          NA

| You are really on a roll!

  |====                                                    |   8%

| The way you tell R that you want to select some particular
| elements (i.e. a 'subset') from a vector is by placing an
| 'index vector' in square brackets immediately following the
| name of the vector.

...

  |======                                                  |  11%

| For a simple example, try x[1:10] to view the first ten
| elements of x.

> x[1:10]
 [1]         NA  1.0161235  0.1739052         NA -0.6246671
 [6]         NA -2.5726967         NA -0.4400246         NA

| All that hard work is paying off!

  |=======                                                 |  13%

| Index vectors come in four different flavors -- logical
| vectors, vectors of positive integers, vectors of negative
| integers, and vectors of character strings -- each of which
| we'll cover in this lesson.

...

  |=========                                               |  16%

| Let's start by indexing with logical vectors. One common
| scenario when working with real-world data is that we want to
| extract all elements of a vector that are not NA (i.e. missing
| data). Recall that is.na(x) yields a vector of logical values
| the same length as x, with TRUEs corresponding to NA values in
| x and FALSEs corresponding to non-NA values in x.

...

  |==========                                              |  18%

| What do you think x[is.na(x)] will give you?

1: A vector of length 0
2: A vector of TRUEs and FALSEs
3: A vector with no NAs
4: A vector of all NAs

Selection: 4

| You are amazing!

  |============                                            |  21%

| Prove it to yourself by typing x[is.na(x)].

> x[is.na(x)]
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

| Excellent job!

  |=============                                           |  24%

| Recall that `!` gives us the negation of a logical expression,
| so !is.na(x) can be read as 'is not NA'. Therefore, if we want
| to create a vector called y that contains all of the non-NA
| values from x, we can use y <- x[!is.na(x)]. Give it a try.

> y<-x[!is.na(x)]

| Your dedication is inspiring!

  |===============                                         |  26%

| Print y to the console.

> y
 [1]  1.01612351  0.17390520 -0.62466706 -2.57269671 -0.44002462
 [6]  0.37101633  0.65818630  1.03885003  0.16175551 -0.32999611
[11]  0.40024254  0.53018587  0.28211580 -0.04009442  0.79493463
[16]  0.60598426 -1.42021598  0.17550349  0.39153186  1.07989501

| Keep working like that and you'll get there!

  |================                                        |  29%

| Now that we've isolated the non-missing values of x and put
| them in y, we can subset y as we please.

...

  |==================                                      |  32%

| Recall that the expression y > 0 will give us a vector of
| logical values the same length as y, with TRUEs corresponding
| to values of y that are greater than zero and FALSEs
| corresponding to values of y that are less than or equal to
| zero. What do you think y[y > 0] will give you?

1: A vector of all NAs
2: A vector of all the positive elements of y
3: A vector of length 0
4: A vector of all the negative elements of y
5: A vector of TRUEs and FALSEs

Selection: 2

| All that hard work is paying off!

  |===================                                     |  34%

| Type y[y > 0] to see that we get all of the positive elements
| of y, which are also the positive elements of our original
| vector x.

> y[y>0]
 [1] 1.0161235 0.1739052 0.3710163 0.6581863 1.0388500 0.1617555
 [7] 0.4002425 0.5301859 0.2821158 0.7949346 0.6059843 0.1755035
[13] 0.3915319 1.0798950

| You're the best!

  |=====================                                   |  37%

| You might wonder why we didn't just start with x[x > 0] to
| isolate the positive elements of x. Try that now to see why.

> x[x>0]
 [1]        NA 1.0161235 0.1739052        NA        NA        NA
 [7]        NA 0.3710163 0.6581863 1.0388500 0.1617555        NA
[13]        NA        NA        NA 0.4002425        NA 0.5301859
[19]        NA        NA        NA        NA        NA 0.2821158
[25]        NA 0.7949346 0.6059843        NA        NA 0.1755035
[31] 0.3915319        NA 1.0798950        NA

| Keep up the great work!

  |======================                                  |  39%

| Since NA is not a value, but rather a placeholder for an
| unknown quantity, the expression NA > 0 evaluates to NA. Hence
| we get a bunch of NAs mixed in with our positive numbers when
| we do this.

...

  |========================                                |  42%

| Combining our knowledge of logical operators with our new
| knowledge of subsetting, we could do this -- x[!is.na(x) & x >
| 0]. Try it out.

> x[!is.na(x)&x>0]
 [1] 1.0161235 0.1739052 0.3710163 0.6581863 1.0388500 0.1617555
 [7] 0.4002425 0.5301859 0.2821158 0.7949346 0.6059843 0.1755035
[13] 0.3915319 1.0798950

| You got it right!

  |=========================                               |  45%

| In this case, we request only values of x that are both
| non-missing AND greater than zero.

...
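
Side note: another common way to drop the NAs from a logical subset is `which()`, which returns only the positions where the condition is TRUE and so skips NA. A minimal sketch on a short made-up vector (named `x` here, but unrelated to the lesson's data):

```r
x <- c(NA, 1.5, -0.3, NA, 2.1)

x > 0                   # NA TRUE FALSE NA TRUE
x[x > 0]                # NA 1.5 NA 2.1  (NAs leak through)
x[which(x > 0)]         # 1.5 2.1
x[!is.na(x) & x > 0]    # 1.5 2.1, the approach used in the lesson
```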

  |===========================                             |  47%

| I've already shown you how to subset just the first ten values
| of x using x[1:10]. In this case, we're providing a vector of
| positive integers inside of the square brackets, which tells R
| to return only the elements of x numbered 1 through 10.

...

  |============================                            |  50%

| Many programming languages use what's called 'zero-based
| indexing', which means that the first element of a vector is
| considered element 0. R uses 'one-based indexing', which (you
| guessed it!) means the first element of a vector is considered
| element 1.

...

  |=============================                           |  53%

| Can you figure out how we'd subset the 3rd, 5th, and 7th
| elements of x? Hint -- Use the c() function to specify the
| element numbers as a numeric vector.

> x[c(3,5,7)]
[1]  0.1739052 -0.6246671 -2.5726967

| Keep up the great work!

  |===============================                         |  55%

| It's important that when using integer vectors to subset our
| vector x, we stick with the set of indexes {1, 2, ..., 40}
| since x only has 40 elements. What happens if we ask for the
| zeroth element of x (i.e. x[0])? Give it a try.

> x[0]
numeric(0)

| You're the best!

  |================================                        |  58%

| As you might expect, we get nothing useful. Unfortunately, R
| doesn't prevent us from doing this. What if we ask for the
| 3000th element of x? Try it out.

> x[3000]
[1] NA

| Great job!

  |==================================                      |  61%

| Again, nothing useful, but R doesn't prevent us from asking for
| it. This should be a cautionary tale. You should always make
| sure that what you are asking for is within the bounds of the
| vector you're working with.

...
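
Side note: since R does not stop us, one option is to check the indexes ourselves before subsetting. The sketch below is only one possible approach; `idx` and `valid` are illustrative names.

```r
x <- c(10, 20, 30)
idx <- c(2, 0, 5)          # 0 and 5 are outside 1..length(x)

# Without a check: the 0 is silently dropped and 5 yields NA.
x[idx]                     # 20 NA

# With a check: keep only indexes inside the bounds of x.
valid <- idx >= 1 & idx <= length(x)
x[idx[valid]]              # 20
```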

  |===================================                     |  63%

| What if we're interested in all elements of x EXCEPT the 2nd
| and 10th? It would be pretty tedious to construct a vector
| containing all numbers 1 through 40 EXCEPT 2 and 10.

...

  |=====================================                   |  66%

| Luckily, R accepts negative integer indexes. Whereas x[c(2,
| 10)] gives us ONLY the 2nd and 10th elements of x, x[c(-2,
| -10)] gives us all elements of x EXCEPT for the 2nd and 10th
| elements. Try x[c(-2, -10)] now to see this.

> x[c(-2,-10)]
 [1]          NA  0.17390520          NA -0.62466706          NA
 [6] -2.57269671          NA -0.44002462  0.37101633  0.65818630
[11]  1.03885003  0.16175551          NA -0.32999611          NA
[16]          NA          NA  0.40024254          NA  0.53018587
[21]          NA          NA          NA          NA          NA
[26]  0.28211580 -0.04009442          NA  0.79493463  0.60598426
[31]          NA -1.42021598          NA  0.17550349  0.39153186
[36]          NA  1.07989501          NA

| Excellent job!

  |======================================                  |  68%

| A shorthand way of specifying multiple negative numbers is to
| put the negative sign out in front of the vector of positive
| numbers. Type x[-c(2, 10)] to get the exact same result.

> x[-c(2,10)]
 [1]          NA  0.17390520          NA -0.62466706          NA
 [6] -2.57269671          NA -0.44002462  0.37101633  0.65818630
[11]  1.03885003  0.16175551          NA -0.32999611          NA
[16]          NA          NA  0.40024254          NA  0.53018587
[21]          NA          NA          NA          NA          NA
[26]  0.28211580 -0.04009442          NA  0.79493463  0.60598426
[31]          NA -1.42021598          NA  0.17550349  0.39153186
[36]          NA  1.07989501          NA

| All that hard work is paying off!
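
Side note: positive and negative indexes cannot be mixed in one index vector; R raises an error in that case. The sketch below shows this on a short made-up vector, with `tryCatch()` used only so the example keeps running after the error. The name `mixed` is illustrative.

```r
x <- c(10, 20, 30, 40, 50)

x[-c(2, 4)]                           # 10 30 50
x[setdiff(seq_along(x), c(2, 4))]     # 10 30 50, the tedious positive version

# Mixing signs is an error, caught here to show what happens.
mixed <- tryCatch(x[c(-1, 2)], error = function(e) "error")
mixed                                 # "error"
```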

  |========================================                |  71%

| So far, we've covered three types of index vectors -- logical,
| positive integer, and negative integer. The only remaining type
| requires us to introduce the concept of 'named' elements.

...

  |=========================================               |  74%

| Create a numeric vector with three named elements using vect <-
| c(foo = 11, bar = 2, norf = NA).

> vect<-c(foo=11,bar=2,norf=NA)

| Great job!

  |===========================================             |  76%

| When we print vect to the console, you'll see that each element
| has a name. Try it out.

> vect
 foo  bar norf 
  11    2   NA 

| You are amazing!

  |============================================            |  79%

| We can also get the names of vect by passing vect as an
| argument to the names() function. Give that a try.

> names(vect)
[1] "foo"  "bar"  "norf"

| You are quite good my friend!

  |==============================================          |  82%

| Alternatively, we can create an unnamed vector vect2 with c(11,
| 2, NA). Do that now.

> vect2<-c(11,2,NA)

| That's the answer I was looking for.

  |===============================================         |  84%

| Then, we can add the `names` attribute to vect2 after the fact
| with names(vect2) <- c("foo", "bar", "norf"). Go ahead.

> names(vect2)<-c("foo","bar","norf")

| You got it right!

  |=================================================       |  87%

| Now, let's check that vect and vect2 are the same by passing
| them as arguments to the identical() function.

> identical(vect,vect2)
[1] TRUE

| Great job!

  |==================================================      |  89%

| Indeed, vect and vect2 are identical named vectors.

...

  |====================================================    |  92%

| Now, back to the matter of subsetting a vector by named
| elements. Which of the following commands do you think would
| give us the second element of vect?

1: vect["bar"]
2: vect["2"]
3: vect[bar]

Selection: 1

| You are really on a roll!

  |=====================================================   |  95%

| Now, try it out.

> vect["bar"]
bar 
  2 

| All that hard work is paying off!

  |======================================================= |  97%

| Likewise, we can specify a vector of names with vect[c("foo",
| "bar")]. Try it out.

> vect[c("foo","bar")]
foo bar 
 11   2 

| You are doing so well!
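
Side note: as with out-of-range integer indexes, asking for a name that does not exist gives NA rather than an error. A short sketch; "baz" is a made-up name that is deliberately absent, and `picked` is an illustrative variable.

```r
vect <- c(foo = 11, bar = 2, norf = NA)

# "baz" is not a name in vect, so its slot comes back as NA.
picked <- vect[c("bar", "baz")]
picked

# Checking names first avoids the surprise.
"baz" %in% names(vect)     # FALSE

# Double brackets return the bare value without its name.
vect[["bar"]]              # 2
```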

  |========================================================| 100%

| Now you know all four methods of subsetting data from vectors.
| Different approaches are best in different scenarios and when
| in doubt, try it out!

...

| Are you currently enrolled in the Coursera course associated
| with this lesson?

1: Yes
2: No

Selection: 2

| You've reached the end of this lesson! Returning to the main
| menu...

Learning swirl, Part 5: Missing Values

Posted on 2015-05-07 | Category: Tech Sharing


| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

 1: Basic Building Blocks      2: Workspace and Files     
 3: Sequences of Numbers       4: Vectors                 
 5: Missing Values             6: Subsetting Vectors      
 7: Matrices and Data Frames   8: Logic                   
 9: Functions                 10: lapply and sapply       
11: vapply and tapply         12: Looking at Data         
13: Simulation                14: Dates and Times         
15: Base Graphics             

Selection: 5

  |                                                        |   0%

| Missing values play an important role in statistics and data
| analysis. Often, missing values must not be ignored, but rather
| they should be carefully studied to see if there's an
| underlying pattern or cause for their missingness.

...

  |===                                                     |   5%

| In R, NA is used to represent any value that is 'not available'
| or 'missing' (in the statistical sense). In this lesson, we'll
| explore missing values further.

...

  |======                                                  |  11%

| Any operation involving NA generally yields NA as the result.
| To illustrate, let's create a vector c(44, NA, 5, NA) and
| assign it to a variable x.

> x<-c(44,NA,5,NA)

| Perseverance, that's the answer.

  |=========                                               |  16%

| Now, let's multiply x by 3.

> x*3
[1] 132  NA  15  NA

| That's a job well done!

  |============                                            |  21%

| Notice that the elements of the resulting vector that
| correspond with the NA values in x are also NA.

...

  |===============                                         |  26%

| To make things a little more interesting, let's create a vector
| containing 1000 draws from a standard normal distribution with
| y <- rnorm(1000).

> y<-rnorm(1000)

| That's the answer I was looking for.

  |==================                                      |  32%

| Next, let's create a vector containing 1000 NAs with z <-
| rep(NA, 1000).

> z<-rep(NA,1000)

| You nailed it! Good job!

  |=====================                                   |  37%

| Finally, let's select 100 elements at random from these 2000
| values (combining y and z) such that we don't know how many NAs
| we'll wind up with or what positions they'll occupy in our
| final vector -- my_data <- sample(c(y, z), 100).

> my_data<-sample(c(y,z),100)

| You are doing so well!
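
Side note: because `rnorm()` and `sample()` are random, the output later in this transcript will differ on every run. If a repeatable result is wanted, one option is to fix the seed first. The seed value 42 and the names below are arbitrary choices for illustration, not part of the lesson.

```r
set.seed(42)                               # arbitrary seed
pool_a <- c(rnorm(1000), rep(NA, 1000))
draw_a <- sample(pool_a, 100)

set.seed(42)                               # same seed, same sequence of draws
pool_b <- c(rnorm(1000), rep(NA, 1000))
draw_b <- sample(pool_b, 100)

identical(draw_a, draw_b)                  # TRUE
```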

  |========================                                |  42%

| Let's first ask the question of where our NAs are located in
| our data. The is.na() function tells us whether each element of
| a vector is NA. Call is.na() on my_data and assign the result
| to my_na.

> my_na<-is.na(my_data)

| All that practice is paying off!

  |===========================                             |  47%

| Now, print my_na to see what you came up with.

> my_na
  [1]  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE FALSE FALSE  TRUE
 [11]  TRUE FALSE  TRUE FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE
 [21] FALSE  TRUE FALSE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE
 [31]  TRUE FALSE  TRUE  TRUE FALSE  TRUE FALSE FALSE FALSE  TRUE
 [41] FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE
 [51] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE
 [61] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
 [71]  TRUE FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE FALSE  TRUE
 [81] FALSE  TRUE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
 [91] FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE

| All that hard work is paying off!

  |=============================                           |  53%

| Everywhere you see a TRUE, you know the corresponding element
| of my_data is NA. Likewise, everywhere you see a FALSE, you
| know the corresponding element of my_data is one of our random
| draws from the standard normal distribution.

...

  |================================                        |  58%

| In our previous discussion of logical operators, we introduced
| the `==` operator as a method of testing for equality between
| two objects. So, you might think the expression my_data == NA
| yields the same results as is.na(). Give it a try.

> my_data==NA
  [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
 [21] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
 [41] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
 [61] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
 [81] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

| Keep up the great work!

  |===================================                     |  63%

| The reason you got a vector of all NAs is that NA is not really
| a value, but just a placeholder for a quantity that is not
| available. Therefore the logical expression is incomplete and R
| has no choice but to return a vector of the same length as
| my_data that contains all NAs.

...

  |======================================                  |  68%

| Don't worry if that's a little confusing. The key takeaway is
| to be cautious when using logical expressions anytime NAs might
| creep in, since a single NA value can derail the entire thing.

...
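
Side note: a compact comparison of `== NA` and `is.na()` on the small vector from the start of this lesson (named `v` here to keep it separate from the lesson's variables). `anyNA()` is a quick yes/no check for whether NAs are present at all.

```r
v <- c(44, NA, 5, NA)

v == NA        # NA NA NA NA: comparing with NA never gives TRUE or FALSE
is.na(v)       # FALSE TRUE FALSE TRUE
anyNA(v)       # TRUE
```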

  |=========================================               |  74%

| So, back to the task at hand. Now that we have a vector, my_na,
| that has a TRUE for every NA and FALSE for every numeric value,
| we can compute the total number of NAs in our data.

...

  |============================================            |  79%

| The trick is to recognize that underneath the surface, R
| represents TRUE as the number 1 and FALSE as the number 0.
| Therefore, if we take the sum of a bunch of TRUEs and FALSEs,
| we get the total number of TRUEs.

...

  |===============================================         |  84%

| Let's give that a try here. Call the sum() function on my_na to
| count the total number of TRUEs in my_na, and thus the total
| number of NAs in my_data. Don't assign the result to a new
| variable.

> sum(my_na)
[1] 53

| All that practice is paying off!
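
Side note: the same TRUE-as-1 trick gives the proportion of missing values via `mean()`. The sketch below uses a short made-up vector `v`; the last line also shows the `na.rm` argument, which many summary functions accept to skip NAs.

```r
v <- c(44, NA, 5, NA)

sum(is.na(v))            # 2, the count of NAs
mean(is.na(v))           # 0.5, the proportion of NAs

mean(v)                  # NA, because any NA propagates
mean(v, na.rm = TRUE)    # 24.5, the mean of 44 and 5
```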

  |==================================================      |  89%

| Pretty cool, huh? Finally, let's take a look at the data to
| convince ourselves that everything 'adds up'. Print my_data to
| the console.

> my_data
  [1]           NA           NA           NA           NA
  [5]  0.124769189           NA           NA  0.692392963
  [9] -1.746465523           NA           NA -0.821663967
 [13]           NA -0.580694318 -1.511836462  0.081071870
 [17]           NA           NA           NA  1.097226579
 [21] -3.126426132           NA -1.199908058 -0.794525073
 [25]           NA -0.443946101           NA           NA
 [29]           NA           NA           NA  0.742624944
 [33]           NA           NA -1.634124579           NA
 [37] -0.850173971  0.441734720  0.513475081           NA
 [41] -0.368936480 -1.357784834           NA           NA
 [45]  0.007424283 -1.258690752  0.779107391 -1.419960183
 [49]           NA -0.763940473  0.450923280           NA
 [53]           NA           NA           NA           NA
 [57]  0.925643135 -0.003863920           NA           NA
 [61] -0.062849926 -1.557277905           NA           NA
 [65]           NA           NA           NA           NA
 [69]           NA           NA           NA -0.284868951
 [73]           NA           NA  0.056676275  0.240678898
 [77]           NA           NA -0.432834665           NA
 [81]  0.784445940           NA           NA -1.192080644
 [85]           NA  0.768473262 -0.170659651 -1.795948523
 [89]  1.249158629 -0.723159498 -0.460614065  0.238104108
 [93] -1.025906852           NA           NA           NA
 [97]  0.982965761 -0.084049625 -0.102720652  0.552020816

| You are really on a roll!

  |=====================================================   |  95%

| Now that we've got NAs down pat, let's look at a second type of
| missing value -- NaN, which stands for 'not a number'. To
| generate NaN, try dividing (using a forward slash) 0 by 0 now.

> 0/0
[1] NaN

| You are amazing!

  |========================================================| 100%

| Let's do one more, just for fun. In R, Inf stands for infinity.
| What happens if you subtract Inf from Inf?

> Inf-Inf
[1] NaN

| You are amazing!
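
One detail that may help when both kinds of missing value appear together: is.na() reports TRUE for NaN as well as NA, whereas is.nan() is TRUE only for NaN. The vector `vals` below is a hypothetical example, not part of the lesson.

```r
# Hypothetical vector mixing a number, an NA, and two ways of producing NaN.
vals <- c(1, NA, 0 / 0, Inf - Inf)

# is.nan() flags only the NaN entries.
nan_flags <- is.nan(vals)

# is.na() flags both NA and NaN.
na_flags <- is.na(vals)

print(nan_flags)
print(na_flags)
```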

| Are you currently enrolled in the Coursera course associated
| with this lesson?

1: Yes
2: No

Selection: 2

| You've reached the end of this lesson! Returning to the main
| menu...

Learning swirl, Part 4: Vectors

Posted on 2015-05-07 | Category: Tech Sharing |


| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

 1: Basic Building Blocks      2: Workspace and Files     
 3: Sequences of Numbers       4: Vectors                 
 5: Missing Values             6: Subsetting Vectors      
 7: Matrices and Data Frames   8: Logic                   
 9: Functions                 10: lapply and sapply       
11: vapply and tapply         12: Looking at Data         
13: Simulation                14: Dates and Times         
15: Base Graphics             

Selection: 4

  |                                                        |   0%

| The simplest and most common data structure in R is the vector.

...

  |==                                                      |   3%

| Vectors come in two different flavors: atomic vectors and
| lists. An atomic vector contains exactly one data type, whereas
| a list may contain multiple data types. We'll explore atomic
| vectors further before we get to lists.

...
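
A quick way to see the difference is to put mixed values into c() and into list(). This is only a sketch, and the names `mixed_atomic` and `mixed_list` are made up for illustration. An atomic vector converts everything to one type, while a list keeps each element as it was.

```r
# c() builds an atomic vector, so all elements are converted to character.
mixed_atomic <- c(1, "a", TRUE)

# list() keeps each element's own type.
mixed_list <- list(1, "a", TRUE)

atomic_class <- class(mixed_atomic)
list_classes <- sapply(mixed_list, class)

print(mixed_atomic)
print(atomic_class)
print(list_classes)
```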

  |===                                                     |   5%

| In previous lessons, we dealt entirely with numeric vectors,
| which are one type of atomic vector. Other types of atomic
| vectors include logical, character, integer, and complex. In
| this lesson, we'll take a closer look at logical and character
| vectors.

...

  |=====                                                   |   8%

| Logical vectors can contain the values TRUE, FALSE, and NA (for
| 'not available'). These values are generated as the result of
| logical 'conditions'. Let's experiment with some simple
| conditions.

...

  |======                                                  |  11%

| First, create a numeric vector num_vect that contains the
| values 0.5, 55, -10, and 6.

> num_vect<-c(0.5,55,-10,6)

| Excellent work!

  |========                                                |  14%

| Now, create a variable called tf that gets the result of
| num_vect < 1, which is read as 'num_vect is less than 1'.

> tf<-num_vect<1

| You're the best!

  |=========                                               |  16%

| What do you think tf will look like?

1: a vector of 4 logical values
2: a single logical value

Selection: 1

| You are amazing!

  |===========                                             |  19%

| Print the contents of tf now.

> tf
[1]  TRUE FALSE  TRUE FALSE

| That's a job well done!

  |============                                            |  22%

| The statement num_vect < 1 is a condition and tf tells us
| whether each corresponding element of our numeric vector
| num_vect satisfies this condition.

...

  |==============                                          |  24%

| The first element of num_vect is 0.5, which is less than 1 and
| therefore the statement 0.5 < 1 is TRUE. The second element of
| num_vect is 55, which is greater than 1, so the statement 55 <
| 1 is FALSE. The same logic applies for the third and fourth
| elements.

...

  |===============                                         |  27%

| Let's try another. Type num_vect >= 6 without assigning the
| result to a new variable.

> num_vect>=6
[1] FALSE  TRUE FALSE  TRUE

| Nice work!

  |=================                                       |  30%

| This time, we are asking whether each individual element of
| num_vect is greater than OR equal to 6. Since only 55 and 6 are
| greater than or equal to 6, the second and fourth elements of
| the result are TRUE and the first and third elements are FALSE.

...

  |==================                                      |  32%

| The `<` and `>=` symbols in these examples are called 'logical
| operators'. Other logical operators include `>`, `<=`, `==` for
| exact equality, and `!=` for inequality.

...

  |====================                                    |  35%

| If we have two logical expressions, A and B, we can ask whether
| at least one is TRUE with A | B (logical 'or' a.k.a. 'union')
| or whether they are both TRUE with A & B (logical 'and' a.k.a.
| 'intersection'). Lastly, !A is the negation of A and is TRUE
| when A is FALSE and vice versa.

...
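
These operators also work element by element on whole vectors. The sketch below reuses num_vect from earlier in the lesson; the names `a`, `b`, `either`, `both`, and `not_a` are illustrative.

```r
num_vect <- c(0.5, 55, -10, 6)

a <- num_vect < 1    # TRUE FALSE TRUE FALSE
b <- num_vect >= 6   # FALSE TRUE FALSE TRUE

either <- a | b      # TRUE wherever at least one condition holds
both <- a & b        # TRUE only where both conditions hold
not_a <- !a          # flips every element of a

print(either)
print(both)
print(not_a)
```

No number is both less than 1 and at least 6, so `both` is FALSE everywhere, while every element here satisfies one condition or the other.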

  |=====================                                   |  38%

| It's a good idea to spend some time playing around with various
| combinations of these logical operators until you get
| comfortable with their use. We'll do a few examples here to get
| you started.

...

  |=======================                                 |  41%

| Try your best to predict the result of each of the following
| statements. You can use pencil and paper to work them out if
| it's helpful. If you get stuck, just guess and you've got a 50%
| chance of getting the right answer!

...

  |========================                                |  43%

| (3 > 5) & (4 == 4)

1: FALSE
2: TRUE

Selection: 1

| Keep working like that and you'll get there!

  |==========================                              |  46%

| (TRUE == TRUE) | (TRUE == FALSE)

1: FALSE
2: TRUE

Selection: 2

| Perseverance, that's the answer.

  |===========================                             |  49%

| ((111 >= 111) | !(TRUE)) & ((4 + 1) == 5)

1: FALSE
2: TRUE

Selection: 2

| You got it right!

  |=============================                           |  51%

| Don't worry if you found these to be tricky. They're supposed
| to be. Working with logical statements in R takes practice, but
| your efforts will be rewarded in future lessons (e.g.
| subsetting and control structures).

...

  |==============================                          |  54%

| Character vectors are also very common in R. Double quotes are
| used to distinguish character objects, as in the following
| example.

...

  |================================                        |  57%

| Create a character vector that contains the following words:
| "My", "name", "is". Remember to enclose each word in its own
| set of double quotes, so that R knows they are character
| strings. Store the vector in a variable called my_char.

> my_char<-c("My","name","is")

| Keep up the great work!

  |=================================                       |  59%

| Print the contents of my_char to see what it looks like.

> my_char
[1] "My"   "name" "is"  

| You got it!

  |===================================                     |  62%

| Right now, my_char is a character vector of length 3. Let's say
| we want to join the elements of my_char together into one
| continuous character string (i.e. a character vector of length
| 1). We can do this using the paste() function.

...

  |====================================                    |  65%

| Type paste(my_char, collapse = " ") now. Make sure there's a
| space between the double quotes in the `collapse` argument.
| You'll see why in a second.

> paste(my_char,collapse = " ")
[1] "My name is"

| Excellent work!

  |======================================                  |  68%

| The `collapse` argument to the paste() function tells R that
| when we join together the elements of the my_char character
| vector, we'd like to separate them with single spaces.

...
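
To see what `collapse` contributes, it may help to try a different separator and to check the length of the result. This is a small sketch using the lesson's my_char; the names `spaced` and `dashed` are illustrative.

```r
my_char <- c("My", "name", "is")

# collapse joins all elements into one string, separated by the given text.
spaced <- paste(my_char, collapse = " ")
dashed <- paste(my_char, collapse = "-")

print(spaced)
print(dashed)

# The original vector has three elements; the collapsed result has one.
print(length(my_char))
print(length(spaced))
```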

  |=======================================                 |  70%

| It seems that we're missing something.... Ah, yes! Your name!

...

  |=========================================               |  73%

| To add (or 'concatenate') your name to the end of my_char, use
| the c() function like this: c(my_char, "your_name_here"). Place
| your name in double quotes where I've put "your_name_here". Try
| it now, storing the result in a new variable called my_name.

> my_name<-c(my_char,"Peter")

| You are doing so well!

  |==========================================              |  76%

| Take a look at the contents of my_name.

> my_name
[1] "My"    "name"  "is"    "Peter"

| Your dedication is inspiring!

  |============================================            |  78%

| Now, use the paste() function once more to join the words in
| my_name together into a single character string. Don't forget
| to say collapse = " "!

> paste(my_name,collapse = " ")
[1] "My name is Peter"

| Great job!

  |=============================================           |  81%

| In this example, we used the paste() function to collapse the
| elements of a single character vector. paste() can also be used
| to join the elements of multiple character vectors.

...

  |===============================================         |  84%

| In the simplest case, we can join two character vectors that
| are each of length 1 (i.e. join two words). Try paste("Hello",
| "world!", sep = " "), where the `sep` argument tells R that we
| want to separate the joined elements with a single space.

> paste("Hello","world!",sep=" ")
[1] "Hello world!"

| Keep up the great work!

  |================================================        |  86%

| For a slightly more complicated example, we can join two
| vectors, each of length 3. Use paste() to join the integer
| vector 1:3 with the character vector c("X", "Y", "Z"). This
| time, use sep = "" to leave no space between the joined
| elements.

> paste(1:3,c("X","Y","Z"),sep="")
[1] "1X" "2Y" "3Z"

| You're the best!

  |==================================================      |  89%

| What do you think will happen if our vectors are of different
| length? (Hint: we talked about this in a previous lesson.)

...

  |===================================================     |  92%

| Vector recycling! Try paste(LETTERS, 1:4, sep = "-"), where
| LETTERS is a predefined variable in R containing a character
| vector of all 26 letters in the English alphabet.

> paste(LETTERS,1:4,sep="-")
 [1] "A-1" "B-2" "C-3" "D-4" "E-1" "F-2" "G-3" "H-4" "I-1" "J-2"
[11] "K-3" "L-4" "M-1" "N-2" "O-3" "P-4" "Q-1" "R-2" "S-3" "T-4"
[21] "U-1" "V-2" "W-3" "X-4" "Y-1" "Z-2"

| Keep up the great work!

  |=====================================================   |  95%

| Since the character vector LETTERS is longer than the numeric
| vector 1:4, R simply recycles, or repeats, 1:4 until it matches
| the length of LETTERS.

...

  |======================================================  |  97%

| Also worth noting is that the numeric vector 1:4 gets 'coerced'
| into a character vector by the paste() function.

...

  |========================================================| 100%

| We'll discuss coercion in another lesson, but all it really
| means is that the numbers 1, 2, 3, and 4 in the output above
| are no longer numbers to R, but rather characters "1", "2",
| "3", and "4".

...
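
Both effects, recycling and coercion, can be seen in a shorter sketch than the LETTERS example. The vector names below are hypothetical; the four letters are paired with 1:2, which R repeats to fill the length.

```r
# 1:2 is recycled to 1, 2, 1, 2 to match the four letters.
labels <- paste(c("A", "B", "C", "D"), 1:2, sep = "-")
print(labels)

# paste() always returns character data, so the digits are now text.
print(is.character(labels))

# The same conversion can be requested explicitly.
digits_as_text <- as.character(1:4)
print(digits_as_text)
```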

| Are you currently enrolled in the Coursera course associated
| with this lesson?

1: Yes
2: No

Selection: 2

| You've reached the end of this lesson! Returning to the main
| menu...

Learning swirl, Part 3: Sequences of Numbers

Posted on 2015-05-07 | Category: Tech Sharing |


| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

 1: Basic Building Blocks      2: Workspace and Files     
 3: Sequences of Numbers       4: Vectors                 
 5: Missing Values             6: Subsetting Vectors      
 7: Matrices and Data Frames   8: Logic                   
 9: Functions                 10: lapply and sapply       
11: vapply and tapply         12: Looking at Data         
13: Simulation                14: Dates and Times         
15: Base Graphics             

Selection: 3

  |                                                        |   0%

| In this lesson, you'll learn how to create sequences of numbers
| in R.

...

  |===                                                     |   5%

| The simplest way to create a sequence of numbers in R is by
| using the `:` operator. Type 1:20 to see how it works.

> 1:20
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

| Perseverance, that's the answer.

  |=====                                                   |   9%

| That gave us every integer between (and including) 1 and 20. We
| could also use it to create a sequence of real numbers. For
| example, try pi:10.

> pi:10
[1] 3.141593 4.141593 5.141593 6.141593 7.141593 8.141593 9.141593

| Perseverance, that's the answer.

  |========                                                |  14%

| The result is a vector of real numbers starting with pi
| (3.142...) and increasing in increments of 1. The upper limit
| of 10 is never reached, since the next number in our sequence
| would be greater than 10.

...
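
The behavior described above can be checked directly. In this sketch the name `s` is illustrative; subtracting pi from each element should leave the whole-number steps 0 through 6 (up to floating-point rounding, hence all.equal()).

```r
s <- pi:10

# Seven values: pi, pi + 1, ..., pi + 6. The next one would exceed 10.
n_values <- length(s)
print(n_values)

# The largest element stays below the upper limit.
print(max(s) < 10)

# The offsets from pi are 0, 1, ..., 6 up to rounding error.
print(all.equal(s - pi, 0:6))
```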

  |==========                                              |  18%

| What happens if we do 15:1? Give it a try to find out.

> 15:1
 [1] 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1

| Excellent job!

  |=============                                           |  23%

| It counted backwards in increments of 1! It's unlikely we'd
| want this behavior, but nonetheless it's good to know how it
| could happen.

...

  |===============                                         |  27%

| Remember that if you have questions about a particular R
| function, you can access its documentation with a question mark
| followed by the function name: ?function_name_here. However, in
| the case of an operator like the colon used above, you must
| enclose the symbol in backticks like this: ?`:`. (NOTE: The
| backtick (`) key is generally located in the top left corner of
| a keyboard, above the Tab key. If you don't have a backtick
| key, you can use regular quotes.)

...

  |==================                                      |  32%

| Pull up the documentation for `:` now.

> ?`:`

| That's a job well done!

  |====================                                    |  36%

| Often, we'll desire more control over a sequence we're creating
| than what the `:` operator gives us. The seq() function serves
| this purpose.

...

  |=======================                                 |  41%

| The most basic use of seq() does exactly the same thing as the
| `:` operator. Try seq(1, 20) to see this.

> seq(1,20)
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

| That's correct!

  |=========================                               |  45%

| This gives us the same output as 1:20. However, let's say that
| instead we want a vector of numbers ranging from 0 to 10,
| incremented by 0.5. seq(0, 10, by=0.5) does just that. Try it
| out.

> seq(0,10,by=0.5)
 [1]  0.0  0.5  1.0  1.5  2.0  2.5  3.0  3.5  4.0  4.5  5.0  5.5
[13]  6.0  6.5  7.0  7.5  8.0  8.5  9.0  9.5 10.0

| You are doing so well!

  |============================                            |  50%

| Or maybe we don't care what the increment is and we just want a
| sequence of 30 numbers between 5 and 10. seq(5, 10, length=30)
| does the trick. Give it a shot now and store the result in a
| new variable called my_seq.

> my_seq<-seq(5,10,length=30)

| You are quite good my friend!
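
A note on the argument name: the full name in seq() is `length.out`, and `length` works in the transcript above because R allows partial matching of argument names. The sketch below spells it out and checks that the spacing is even; the name `steps` is illustrative.

```r
# Same call as in the lesson, with the argument name written in full.
my_seq <- seq(5, 10, length.out = 30)

# 30 points spanning a distance of 5 leave 29 equal gaps of 5/29.
steps <- diff(my_seq)

print(length(my_seq))
print(range(my_seq))
print(all.equal(steps, rep(5 / 29, 29)))
```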

  |===============================                         |  55%

| To confirm that my_seq has length 30, we can use the length()
| function. Try it now.

> length(my_seq)
[1] 30

| Perseverance, that's the answer.

  |=================================                       |  59%

| Let's pretend we don't know the length of my_seq, but we want
| to generate a sequence of integers from 1 to N, where N
| represents the length of the my_seq vector. In other words, we
| want a new vector (1, 2, 3, ...) that is the same length as
| my_seq.

...

  |====================================                    |  64%

| There are several ways we could do this. One possibility is to
| combine the `:` operator and the length() function like this:
| 1:length(my_seq). Give that a try.

> 1:length(my_seq)
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
[21] 21 22 23 24 25 26 27 28 29 30

| You got it!

  |======================================                  |  68%

| Another option is to use seq(along.with = my_seq). Give that a
| try.

> seq(along.with=my_seq)
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
[21] 21 22 23 24 25 26 27 28 29 30

| Keep working like that and you'll get there!

  |=========================================               |  73%

| However, as is the case with many common tasks, R has a
| separate built-in function for this purpose called seq_along().
| Type seq_along(my_seq) to see it in action.

> seq_along(my_seq)
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
[21] 21 22 23 24 25 26 27 28 29 30

| You are doing so well!
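
One practical reason to prefer seq_along() shows up with empty vectors. This is a minimal sketch with a hypothetical vector `empty`: because `:` happily counts backwards (as 15:1 did earlier), 1:length(x) yields 1 0 when x has length zero, while seq_along() correctly yields nothing.

```r
# Hypothetical empty numeric vector.
empty <- numeric(0)

# length(empty) is 0, so this is 1:0, which counts down from 1 to 0.
colon_result <- 1:length(empty)

# seq_along() returns an empty integer vector, as one would hope.
along_result <- seq_along(empty)

print(colon_result)
print(along_result)
```

In a loop such as `for (i in 1:length(x))`, the first form would run twice on an empty `x`, which is rarely what is intended.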

  |===========================================             |  77%

| There are often several approaches to solving the same problem,
| particularly in R. Simple approaches that involve less typing
| are generally best. It's also important for your code to be
| readable, so that you and others can figure out what's going on
| without too much hassle.

...

  |==============================================          |  82%

| If R has a built-in function for a particular task, it's likely
| that function is highly optimized for that purpose and is your
| best option. As you become a more advanced R programmer, you'll
| design your own functions to perform tasks when there are no
| better options. We'll explore writing your own functions in
| future lessons.

...

  |================================================        |  86%

| One more function related to creating sequences of numbers is
| rep(), which stands for 'replicate'. Let's look at a few uses.

...

  |===================================================     |  91%

| If we're interested in creating a vector that contains 40
| zeros, we can use rep(0, times = 40). Try it out.

> rep(0,times=40)
 [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[32] 0 0 0 0 0 0 0 0 0

| Keep up the great work!

  |=====================================================   |  95%

| If instead we want our vector to contain 10 repetitions of the
| vector (0, 1, 2), we can do rep(c(0, 1, 2), times = 10). Go
| ahead.

> rep(c(0,1,2),times=10)
 [1] 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2

| Excellent job!

  |========================================================| 100%

| Finally, let's say that rather than repeating the vector (0, 1,
| 2) over and over again, we want our vector to contain 10 zeros,
| then 10 ones, then 10 twos. We can do this with the `each`
| argument. Try rep(c(0, 1, 2), each = 10).

> rep(c(0,1,2),each=10)
 [1] 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2

| Keep working like that and you'll get there!
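
The two arguments can also be combined. In the short sketch below (shorter counts are used so the output is easy to read, and the variable names are illustrative), `each` is applied first and the result is then repeated `times` times.

```r
by_times <- rep(c(0, 1, 2), times = 2)           # 0 1 2 0 1 2
by_each <- rep(c(0, 1, 2), each = 2)             # 0 0 1 1 2 2
combined <- rep(c(0, 1, 2), each = 2, times = 2) # 0 0 1 1 2 2 0 0 1 1 2 2

print(by_times)
print(by_each)
print(combined)
```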

| Are you currently enrolled in the Coursera course associated
| with this lesson?

1: Yes
2: No

Selection: 2

| You've reached the end of this lesson! Returning to the main
| menu...

Learning swirl, Part 2: Workspace and Files

Posted on 2015-05-07 | Category: Tech Sharing |


| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

 1: Basic Building Blocks      2: Workspace and Files        3: Sequences of Numbers    
 4: Vectors                    5: Missing Values             6: Subsetting Vectors      
 7: Matrices and Data Frames   8: Logic                      9: Functions               
10: lapply and sapply         11: vapply and tapply         12: Looking at Data         
13: Simulation                14: Dates and Times           15: Base Graphics           


Selection: 2

  |                                                                                      |   0%

| In this lesson, you'll learn how to examine your local workspace in R and begin to explore
| the relationship between your workspace and the file system of your machine.

...

  |==                                                                                    |   2%

| Because different operating systems have different conventions with regard to things like
| file paths, the outputs of these commands may vary across machines.

...

  |====                                                                                  |   5%

| However, it's important to note that R provides a common API (a common set of commands) for
| interacting with files, so that your code will work across different kinds of computers.

...

  |======                                                                                |   7%

| Let's jump right in so you can get a feel for how these special functions work!

...

  |========                                                                              |  10%

| Determine which directory your R session is using as its current working directory using
| getwd().

> getwd()
[1] "/Users/xiaomodepro/DataAnalysis"

| Keep up the great work!

  |==========                                                                            |  12%

| List all the objects in your local workspace using ls().

> ls()
character(0)

| You are quite good my friend!

  |=============                                                                         |  15%

| Some R commands are the same as their equivalent commands on Linux or on a Mac. Both
| Linux and Mac operating systems are based on an operating system called Unix. It's always
| a good idea to learn more about Unix!

...

  |===============                                                                       |  17%

| Assign 9 to x using x <- 9.

> x<-9

| You are really on a roll!

  |=================                                                                     |  20%

| Now take a look at objects that are in your workspace using ls().

> ls()
[1] "x"

| Keep working like that and you'll get there!

  |===================                                                                   |  22%

| List all the files in your working directory using list.files() or dir().

> list.files()
 [1] "Anscombe.R"                          "case_Anscombe.R"                                      

| Nice work!

  |=====================                                                                 |  24%

| As we go through this lesson, you should be examining the help page for each new function.
| Check out the help page for list.files with the command ?list.files.

> ?list.files

| Keep working like that and you'll get there!

  |=======================                                                               |  27%

| One of the most helpful parts of any R help file is the See Also section. Read that
| section for list.files. Some of these functions may be used in later portions of this
| lesson.

...

  |=========================                                                             |  29%

| Using the args() function on a function name is also a handy way to see what arguments a
| function can take.

...

  |===========================                                                           |  32%

| Use the args() function to determine the arguments to list.files().

> args(list.files)
function (path = ".", pattern = NULL, all.files = FALSE, full.names = FALSE, 
    recursive = FALSE, ignore.case = FALSE, include.dirs = FALSE, 
    no.. = FALSE) 
NULL

| You got it right!

  |=============================                                                         |  34%

| Assign the value of the current working directory to a variable called "old.dir".

> old.dir<-getwd()

| You're the best!

  |===============================                                                       |  37%

| We will use old.dir at the end of this lesson to move back to the place that we started. A
| lot of query functions like getwd() have the useful property that they return the answer
| to the question as a result of the function.

...

  |==================================                                                    |  39%

| Use dir.create() to create a directory in the current working directory called "testdir".

> dir.create("testdir")

| Excellent job!

  |====================================                                                  |  41%

| We will do all our work in this new directory and then delete it after we are done. This
| is the R analog to "Take only pictures, leave only footprints."

...

  |======================================                                                |  44%

| Set your working directory to "testdir" with the setwd() command.

> setwd("testdir")

| Excellent job!

  |========================================                                              |  46%

| In general, you will want your working directory to be someplace sensible, perhaps created
| for the specific project that you are working on. In fact, organizing your work in R
| packages using RStudio is an excellent option. Check out RStudio at
| http://www.rstudio.com/

...

  |==========================================                                            |  49%

| Create a file in your working directory called "mytest.R" using the file.create()
| function.

> file.create("mytest.R")
[1] TRUE

| You got it!

  |============================================                                          |  51%

| This should be the only file in this newly created directory. Let's check this by listing
| all the files in the current directory.

> list.files()
[1] "mytest.R"

| Nice work!

  |==============================================                                        |  54%

| Check to see if "mytest.R" exists in the working directory using the file.exists()
| function.

> file.exists("mytest.R")
[1] TRUE

| All that hard work is paying off!

  |================================================                                      |  56%

| These sorts of functions are excessive for interactive use. But, if you are running a
| program that loops through a series of files and does some processing on each one, you
| will want to check to see that each exists before you try to process it.

...
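
The looping scenario mentioned above might look something like the following sketch. Everything here is hypothetical: the scratch directory, the file names "a.R" and "missing.R", and the "processing" step, which just records the file name. It works in a temporary directory so nothing in your own folders is touched, and it uses file.path(), which is introduced later in this lesson, to build the paths.

```r
# Hypothetical scratch directory containing one real file.
work_dir <- tempfile("swirl_demo_")
dir.create(work_dir)
file.create(file.path(work_dir, "a.R"))

# One candidate exists, the other does not.
candidates <- file.path(work_dir, c("a.R", "missing.R"))

processed <- character(0)
for (f in candidates) {
  # Skip anything that is not actually on disk.
  if (!file.exists(f)) next
  # Stand-in for real processing: remember which file was handled.
  processed <- c(processed, basename(f))
}

print(processed)

# Clean up the scratch directory.
unlink(work_dir, recursive = TRUE)
```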

  |==================================================                                    |  59%

| Access information about the file "mytest.R" by using file.info().

> file.info("mytest.R")
         size isdir mode               mtime               ctime               atime uid gid
mytest.R    0 FALSE  644 2015-05-08 00:36:33 2015-05-08 00:36:33 2015-05-08 00:36:33 501  20
             uname grname
mytest.R xmuxiaomo  staff

| That's the answer I was looking for.

  |====================================================                                  |  61%

| You can use the $ operator --- e.g., file.info("mytest.R")$mode --- to grab specific
| items.

...
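
As a sketch of the `$` usage, the example below creates a hypothetical empty temporary file and pulls out individual fields from the file.info() result. The names `tmp`, `info`, `size_bytes`, and `is_directory` are illustrative.

```r
# Hypothetical empty file in the session's temporary directory.
tmp <- tempfile(fileext = ".R")
file.create(tmp)

# file.info() returns a data frame; $ extracts one column from it.
info <- file.info(tmp)
size_bytes <- info$size
is_directory <- info$isdir

print(size_bytes)
print(is_directory)

# Remove the temporary file again.
file.remove(tmp)
```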

  |=======================================================                               |  63%

| Change the name of the file "mytest.R" to "mytest2.R" by using file.rename().

> file.rename("mytest.R","mytest2.R")
[1] TRUE

| You got it right!

  |=========================================================                             |  66%

| Your operating system will provide simpler tools for these sorts of tasks, but having the
| ability to manipulate files programmatically is useful. You might now try to delete
| mytest.R using file.remove('mytest.R'), but that won't work since mytest.R no longer
| exists. You have already renamed it.

...

  |===========================================================                           |  68%

| Make a copy of "mytest2.R" called "mytest3.R" using file.copy().

> file.copy("mytest2.R","mytest3.R")
[1] TRUE

| That's the answer I was looking for.

  |=============================================================                         |  71%

| You now have two files in the current directory. That may not seem very interesting. But
| what if you were working with dozens, or millions, of individual files? In that case,
| being able to programmatically act on many files would be absolutely necessary. Don't
| forget that you can, temporarily, leave the lesson by typing play() and then return by
| typing nxt().

...

  |===============================================================                       |  73%

| Provide the relative path to the file "mytest3.R" by using file.path().

> file.path("mytest3.R")
[1] "mytest3.R"

| Perseverance, that's the answer.

  |=================================================================                     |  76%

| You can use file.path to construct file and directory paths that are independent of the
| operating system your R code is running on. Pass 'folder1' and 'folder2' as arguments to
| file.path to make a platform-independent pathname.

> file.path("folder1","folder2")
[1] "folder1/folder2"

| All that practice is paying off!
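
file.path() accepts any number of components, and the result can be taken apart again with basename() and dirname(). A brief sketch, using a hypothetical file name "data.csv":

```r
# Build a path from three components (names are purely illustrative).
p <- file.path("folder1", "folder2", "data.csv")
print(p)

# Split it back into the file name and its containing directory.
file_part <- basename(p)
dir_part <- dirname(p)
print(file_part)
print(dir_part)
```

Building paths this way avoids hard-coding a separator into strings scattered through a script.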

  |===================================================================                   |  78%

| Take a look at the documentation for dir.create by entering ?dir.create . Notice the
| 'recursive' argument. In order to create nested directories, 'recursive' must be set to
| TRUE.

> ?dir.create

| You are amazing!

  |=====================================================================                 |  80%

| Create a directory in the current working directory called "testdir2" and a subdirectory
| for it called "testdir3", all in one command by using dir.create() and file.path().

> dir.create(file.path("testdir2","testdir3"),recursive=TRUE)

| Excellent job!

  |=======================================================================               |  83%

| To delete a directory you need to use the recursive = TRUE argument with the function
| unlink(). If you don't use recursive = TRUE, R is concerned that you're unaware that
| you're deleting a directory and all of its contents. R reasons that, if you don't specify
| that recursive equals TRUE, you don't know that something is in the directory you're
| trying to delete. R tries to prevent you from making a mistake.

...

  |=========================================================================             |  85%

| Delete the "testdir2" directory that you created by using unlink().

> unlink("testdir2",recursive=TRUE)

| You got it!
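
Note: a sketch of the safeguard described above, again in a scratch directory from tempfile() (my assumption, not part of the lesson). Calling unlink() on a directory without recursive = TRUE leaves the directory in place.

```r
scratch <- tempfile("swirl_unlink_")           # hypothetical scratch directory
dir.create(scratch)
file.create(file.path(scratch, "mytest.R"))

# No recursive = TRUE: the directory and its contents are left alone.
unlink(scratch)
still_there <- dir.exists(scratch)

# With recursive = TRUE the directory and everything in it are removed.
unlink(scratch, recursive = TRUE)
gone <- !dir.exists(scratch)

print(c(still_there = still_there, gone = gone))
```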

  |============================================================================          |  88%

| Why is this command named "unlink" rather than something more sensible like "dir.delete"
| or "dir.remove"? Mainly, history. unlink is the traditional Unix call for removing
| files.

...

  |==============================================================================        |  90%

| Go back to your original working directory using setwd(). (Recall that we created the
| variable old.dir with the full path for the original working directory at the start of
| these questions.)

> setwd(old.dir)

| Keep up the great work!

  |================================================================================      |  93%

| It is often helpful to save the settings that you had before you began an analysis and
| then go back to them at the end. This trick is often used within functions; you save, say,
| the par() settings that you started with, mess around a bunch, and then set them back to
| the original values at the end. This isn't the same as what we have done here, but it
| seems similar enough to mention.

...
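
Note: one way to sketch the save-and-restore trick inside a function, using the working directory instead of par() so that no graphics device is opened. The helper name list_files_in is hypothetical; setwd() returning the previous directory and on.exit() are standard base R behaviour.

```r
# Hypothetical helper: list the files in another directory, then go back.
list_files_in <- function(dir) {
  old_dir <- setwd(dir)        # setwd() returns the previous working directory
  on.exit(setwd(old_dir))      # restore it when the function exits, even on error
  list.files()
}

before <- getwd()
files <- list_files_in(tempdir())
after <- getwd()

# The working directory is the same before and after the call.
print(identical(before, after))
```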

  |==================================================================================    |  95%

| Delete the 'testdir' directory that you just left (and everything in it).

> unlink("testdir",recursive=TRUE)

| That's a job well done!

  |====================================================================================  |  98%

| Take nothing but results. Leave nothing but assumptions. That sounds like 'Take nothing
| but pictures. Leave nothing but footprints.' But it makes no sense! Surely our readers can
| come up with a better motto . . .

...

  |======================================================================================| 100%

| In this lesson, you learned how to examine your R workspace and work with the file system
| of your machine from within R. Thanks for playing!

...

| Are you currently enrolled in the Coursera course associated with this lesson?

1: Yes
2: No

Selection: 2

| You've reached the end of this lesson! Returning to the main menu...

Learning swirl, Part 1: Basic Building Blocks

Posted on 2015-05-06 | Category: Tech Sharing |


| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

 1: Basic Building Blocks      2: Workspace and Files     
 3: Sequences of Numbers       4: Vectors                 
 5: Missing Values             6: Subsetting Vectors      
 7: Matrices and Data Frames   8: Logic                   
 9: Functions                 10: lapply and sapply       
11: vapply and tapply         12: Looking at Data         
13: Simulation                14: Dates and Times         
15: Base Graphics             

Selection: 1

  |                                                        |   0%

| In this lesson, we will explore some basic building blocks of
| the R programming language.

...

  |==                                                      |   3%

| If at any point you'd like more information on a particular
| topic related to R, you can type help.start() at the prompt,
| which will open a menu of resources (either within RStudio or
| your default web browser, depending on your setup).
| Alternatively, a simple web search often yields the answer
| you're looking for.

...

  |===                                                     |   5%

| In its simplest form, R can be used as an interactive
| calculator. Type 5 + 7 and press Enter.

> 5+7
[1] 12

| You are doing so well!

  |=====                                                   |   8%

| R simply prints the result of 12 by default. However, R is a
| programming language and often the reason we use a programming
| language as opposed to a calculator is to automate some process
| or avoid unnecessary repetition.

...

  |======                                                  |  11%

| In this case, we may want to use our result from above in a
| second calculation. Instead of retyping 5 + 7 every time we
| need it, we can just create a new variable that stores the
| result.

...

  |========                                                |  14%

| The way you assign a value to a variable in R is by using the
| assignment operator, which is just a 'less than' symbol
| followed by a 'minus' sign. It looks like this: <-

...

  |=========                                               |  16%

| Think of the assignment operator as an arrow. You are assigning
| the value on the right side of the arrow to the variable name
| on the left side of the arrow.

...

  |===========                                             |  19%

| To assign the result of 5 + 7 to a new variable called x, you
| type x <- 5 + 7. This can be read as 'x gets 5 plus 7'. Give it
| a try now.

> x<-5+7

| That's correct!

  |============                                            |  22%

| You'll notice that R did not print the result of 12 this time.
| When you use the assignment operator, R assumes that you don't
| want to see the result immediately, but rather that you intend
| to use the result for something else later on.

...

  |==============                                          |  24%

| To view the contents of the variable x, just type x and press
| Enter. Try it now.

> x
[1] 12

| That's correct!

  |===============                                         |  27%

| Now, store the result of x - 3 in a new variable called y.

> y<-x-3

| Your dedication is inspiring!

  |=================                                       |  30%

| What is the value of y? Type y to find out.

> y
[1] 9

| All that practice is paying off!
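
Note: a short sketch of what the assignments above store. The reassignment of x to 100 is my own addition for illustration; it shows that y keeps the value computed at the time of assignment.

```r
x <- 5 + 7      # nothing is printed on assignment
y <- x - 3      # y stores the value of x - 3, not the expression itself
x <- 100        # reassigning x afterwards does not change y
print(x)
print(y)
```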

  |==================                                      |  32%

| Now, let's create a small collection of numbers called a
| vector. Any object that contains data is called a data
| structure and numeric vectors are the simplest type of data
| structure in R. In fact, even a single number is considered a
| vector of length one.

...

  |====================                                    |  35%

| The easiest way to create a vector is with the c() function,
| which stands for 'concatenate' or 'combine'. To create a vector
| containing the numbers 1.1, 9, and 3.14, type c(1.1, 9, 3.14).
| Try it now and store the result in a variable called z.

> z<-c(1.1,9,3.14)

| Excellent job!

  |=====================                                   |  38%

| Anytime you have questions about a particular function, you can
| access R's built-in help files via the `?` command. For
| example, if you want more information on the c() function, type
| ?c without the parentheses that normally follow a function
| name. Give it a try.

> ?c

| You are amazing!

  |=======================                                 |  41%

| Type z to view its contents. Notice that there are no commas
| separating the values in the output.

> z
[1] 1.10 9.00 3.14

| All that hard work is paying off!

  |========================                                |  43%

| You can combine vectors to make a new vector. Create a new
| vector that contains z, 555, then z again in that order. Don't
| assign this vector to a new variable, so that we can just see
| the result immediately.

> c(z,555,z)
[1]   1.10   9.00   3.14 555.00   1.10   9.00   3.14

| You got it!
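
Note: a small sketch of how c() combines its arguments into one flat vector. Nesting calls to c() does not create nested vectors.

```r
z <- c(1.1, 9, 3.14)

# Combining two copies of z and one extra number gives 3 + 1 + 3 elements.
combined <- c(z, 555, z)
print(length(combined))

# Nested calls to c() are flattened into a single vector.
print(identical(c(c(1, 2), 3), c(1, 2, 3)))
```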

  |==========================                              |  46%

| Numeric vectors can be used in arithmetic expressions. Type the
| following to see what happens: z * 2 + 100.

> z*2+100
[1] 102.20 118.00 106.28

| Nice work!

  |===========================                             |  49%

| First, R multiplied each of the three elements in z by 2. Then
| it added 100 to each element to get the result you see above.

...
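
Note: the two steps described above can be written out separately. The intermediate names doubled and result are my own; the last line is an extra check that operator precedence matters.

```r
z <- c(1.1, 9, 3.14)

doubled <- z * 2          # step 1: multiply each element by 2
result <- doubled + 100   # step 2: add 100 to each element

# Same values as the one-line expression, because * is evaluated before +.
print(isTRUE(all.equal(result, z * 2 + 100)))

# Parentheses change the order and therefore the result.
print(isTRUE(all.equal(result, z * (2 + 100))))
```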

  |=============================                           |  51%

| Other common arithmetic operators are `+`, `-`, `/`, and `^`
| (where x^2 means 'x squared'). To take the square root, use the
| sqrt() function and to take the absolute value, use the abs()
| function.

...
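
Note: a quick sketch of these operators and functions on a made-up vector v (not part of the lesson). Each one works element by element.

```r
v <- c(-4, 9, 16)        # made-up values for illustration

squares <- v^2           # each element squared
magnitudes <- abs(v)     # absolute values
roots <- sqrt(abs(v))    # square roots of the absolute values

print(squares)
print(magnitudes)
print(roots)
```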

  |==============================                          |  54%

| Take the square root of z - 1 and assign it to a new variable
| called my_sqrt.

> my_sqrt<-sqrt(z-1)

| You are quite good my friend!

  |================================                        |  57%

| Before we view the contents of the my_sqrt variable, what do
| you think it contains?

1: a vector of length 3
2: a single number (i.e. a vector of length 1)
3: a vector of length 0 (i.e. an empty vector)

Selection: 1

| Keep up the great work!

  |=================================                       |  59%

| Print the contents of my_sqrt.

> my_sqrt
[1] 0.3162278 2.8284271 1.4628739

| You are quite good my friend!

  |===================================                     |  62%

| As you may have guessed, R first subtracted 1 from each element
| of z, then took the square root of each element. This leaves
| you with a vector of the same length as the original vector z.

...

  |====================================                    |  65%

| Now, create a new variable called my_div that gets the value of
| z divided by my_sqrt.

> my_div<-z/my_sqrt

| Perseverance, that's the answer.

  |======================================                  |  68%

| Which statement do you think is true?

1: my_div is undefined
2: The first element of my_div is equal to the first element of z divided by the first element of my_sqrt, and so on...
3: my_div is a single number (i.e. a vector of length 1)

Selection: 2

| You got it right!

  |=======================================                 |  70%

| Go ahead and print the contents of my_div.

> my_div
[1] 3.478505 3.181981 2.146460

| Your dedication is inspiring!

  |=========================================               |  73%

| When given two vectors of the same length, R simply performs
| the specified arithmetic operation (`+`, `-`, `*`, etc.)
| element-by-element. If the vectors are of different lengths, R
| 'recycles' the shorter vector until it is the same length as
| the longer vector.

...

  |==========================================              |  76%

| When we did z * 2 + 100 in our earlier example, z was a vector
| of length 3, but technically 2 and 100 are each vectors of
| length 1.

...

  |============================================            |  78%

| Behind the scenes, R is 'recycling' the 2 to make a vector of
| 2s and the 100 to make a vector of 100s. In other words, when
| you ask R to compute z * 2 + 100, what it really computes is
| this: z * c(2, 2, 2) + c(100, 100, 100).

...
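
Note: a sketch that makes the recycling explicit with rep(), which is my choice for illustration; R does not literally call rep() behind the scenes, but the results agree.

```r
z <- c(1.1, 9, 3.14)

# What we type: the length-1 vectors 2 and 100 are recycled automatically.
implicit <- z * 2 + 100

# The same computation with the recycling spelled out.
explicit <- z * rep(2, length(z)) + rep(100, length(z))
print(isTRUE(all.equal(implicit, explicit)))

# Recycling a length-2 vector to length 4 repeats it twice.
print(rep(c(0, 10), length.out = 4))
print(c(1, 2, 3, 4) + rep(c(0, 10), length.out = 4))
```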

  |=============================================           |  81%

| To see another example of how this vector 'recycling' works,
| try adding c(1, 2, 3, 4) and c(0, 10). Don't worry about saving
| the result in a new variable.

> c(1,2,3,4)+c(0,10)
[1]  1 12  3 14

| Great job!

  |===============================================         |  84%

| If the length of the shorter vector does not divide evenly into
| the length of the longer vector, R will still apply the
| 'recycling' method, but will throw a warning to let you know
| something fishy might be going on.

...

  |================================================        |  86%

| Try c(1, 2, 3, 4) + c(0, 10, 100) for an example.

> c(1,2,3,4)+c(0,10,100)
[1]   1  12 103   4
Warning message:
In c(1, 2, 3, 4) + c(0, 10, 100) :
  longer object length is not a multiple of shorter object length

| Nice work!
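
Note: a sketch of what happens in the uneven case. The shorter vector is cut off partway through its second repetition, which rep() with length.out reproduces; tryCatch() is used here only to capture the warning text.

```r
long <- c(1, 2, 3, 4)
short <- c(0, 10, 100)

# Recycling by hand: repeat short until it has as many elements as long.
padded <- rep(short, length.out = length(long))
manual <- long + padded

# The automatic version gives the same values, plus a warning (silenced here).
recycled <- suppressWarnings(long + short)
print(identical(manual, recycled))

# Capture the warning message instead of printing it.
msg <- tryCatch(long + short, warning = function(w) conditionMessage(w))
print(msg)
```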

  |==================================================      |  89%

| Before concluding this lesson, I'd like to show you a couple of
| time-saving tricks.

...

  |===================================================     |  92%

| Earlier in the lesson, you computed z * 2 + 100. Let's pretend
| that you made a mistake and that you meant to add 1000 instead
| of 100. You could either re-type the expression, or...

...

  |=====================================================   |  95%

| In many programming environments, the up arrow will cycle
| through previous commands. Try hitting the up arrow on your
| keyboard until you get to this command (z * 2 + 100), then
| change 100 to 1000 and hit Enter. If the up arrow doesn't work
| for you, just type the corrected command.

> z*2+1000
[1] 1002.20 1018.00 1006.28

| All that practice is paying off!

  |======================================================  |  97%

| Finally, let's pretend you'd like to view the contents of a
| variable that you created earlier, but you can't seem to
| remember if you named it my_div or myDiv. You could try both
| and see what works, or...

...

  |========================================================| 100%

| You can type the first two letters of the variable name, then
| hit the Tab key (possibly more than once). Most programming
| environments will provide a list of variables that you've
| created that begin with 'my'. This is called auto-completion
| and can be quite handy when you have many variables in your
| workspace. Give it a try. (If auto-completion doesn't work for
| you, just type my_div and press Enter.)

> my_div
[1] 3.478505 3.181981 2.146460

| You are quite good my friend!
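
Note: if Tab completion is not available, ls() with a pattern gives a similar list of names. This is an alternative I am suggesting, not something the lesson covers; the two variables are recreated so the sketch runs on its own.

```r
z <- c(1.1, 9, 3.14)
my_sqrt <- sqrt(z - 1)
my_div <- z / my_sqrt

# List every variable in the current environment whose name starts with "my".
matches <- ls(pattern = "^my")
print(matches)
```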

| Are you currently enrolled in the Coursera course associated
| with this lesson?

1: Yes
2: No

Selection: 2

| You've reached the end of this lesson! Returning to the main
| menu...