| Please choose a course, or type 0 to exit swirl.
1: R Programming
2: Take me to the swirl course repository!
Selection: 1
| Please choose a lesson, or type 0 to return to course menu.
 1: Basic Building Blocks
 2: Workspace and Files
 3: Sequences of Numbers
 4: Vectors
 5: Missing Values
 6: Subsetting Vectors
 7: Matrices and Data Frames
 8: Logic
 9: Functions
10: lapply and sapply
11: vapply and tapply
12: Looking at Data
13: Simulation
14: Dates and Times
15: Base Graphics
Selection: 6
| | 0%
| In this lesson, we'll see how to extract elements from a vector | based on some conditions that we specify.
...
|= | 3%
| For example, we may only be interested in the first 20 elements | of a vector, or only the elements that are not NA, or only | those that are positive or correspond to a specific variable of | interest. By the end of this lesson, you'll know how to handle | each of these scenarios.
...
|=== | 5%
| I've created for you a vector called x that contains a random | ordering of 20 numbers (from a standard normal distribution) | and 20 NAs. Type x now to see what it looks like.
> x
 [1]          NA  1.01612351  0.17390520          NA -0.62466706
 [6]          NA -2.57269671          NA -0.44002462          NA
[11]  0.37101633  0.65818630  1.03885003  0.16175551          NA
[16] -0.32999611          NA          NA          NA  0.40024254
[21]          NA  0.53018587          NA          NA          NA
[26]          NA          NA  0.28211580 -0.04009442          NA
[31]  0.79493463  0.60598426          NA -1.42021598          NA
[36]  0.17550349  0.39153186          NA  1.07989501          NA
| You are really on a roll!
|==== | 8%
| The way you tell R that you want to select some particular | elements (i.e. a 'subset') from a vector is by placing an | 'index vector' in square brackets immediately following the | name of the vector.
...
|====== | 11%
| For a simple example, try x[1:10] to view the first ten | elements of x.
> x[1:10]
 [1]         NA  1.0161235  0.1739052         NA -0.6246671
 [6]         NA -2.5726967         NA -0.4400246         NA
| All that hard work is paying off!
|======= | 13%
| Index vectors come in four different flavors: logical | vectors, vectors of positive integers, vectors of negative | integers, and vectors of character strings, each of which | we'll cover in this lesson.
...
|========= | 16%
| Let's start by indexing with logical vectors. One common | scenario when working with real-world data is that we want to | extract all elements of a vector that are not NA (i.e. missing | data). Recall that is.na(x) yields a vector of logical values | the same length as x, with TRUEs corresponding to NA values in | x and FALSEs corresponding to non-NA values in x.
...
|========== | 18%
| What do you think x[is.na(x)] will give you?
1: A vector of length 0
2: A vector of TRUEs and FALSEs
3: A vector with no NAs
4: A vector of all NAs
Selection: 4
| You are amazing!
|============ | 21%
| Prove it to yourself by typing x[is.na(x)].
> x[is.na(x)] [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
| Excellent job!
|============= | 24%
| Recall that `!` gives us the negation of a logical expression, | so !is.na(x) can be read as 'is not NA'. Therefore, if we want | to create a vector called y that contains all of the non-NA | values from x, we can use y <- x[!is.na(x)]. Give it a try.
> y<-x[!is.na(x)]
| Your dedication is inspiring!
|=============== | 26%
| Print y to the console.
> y
 [1]  1.01612351  0.17390520 -0.62466706 -2.57269671 -0.44002462
 [6]  0.37101633  0.65818630  1.03885003  0.16175551 -0.32999611
[11]  0.40024254  0.53018587  0.28211580 -0.04009442  0.79493463
[16]  0.60598426 -1.42021598  0.17550349  0.39153186  1.07989501
| Keep working like that and you'll get there!
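The logical-indexing steps above can be sketched on a smaller vector. This is an illustrative example only: the vector v and the names na_mask, only_na, and non_na are hypothetical and are not part of the lesson.

```r
# Hypothetical example vector (not the lesson's x)
v <- c(2.5, NA, -1, NA, 4)

# is.na() returns a logical vector the same length as v
na_mask <- is.na(v)      # FALSE TRUE FALSE TRUE FALSE

# A logical index keeps the positions where the index is TRUE
only_na <- v[na_mask]    # the two NA elements
non_na  <- v[!na_mask]   # 2.5 -1.0 4.0

print(na_mask)
print(only_na)
print(non_na)
```

Storing the mask in a variable is a convenience here; writing v[!is.na(v)] directly, as the lesson does, gives the same result.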
|================ | 29%
| Now that we've isolated the non-missing values of x and put | them in y, we can subset y as we please.
...
|================== | 32%
| Recall that the expression y > 0 will give us a vector of | logical values the same length as y, with TRUEs corresponding | to values of y that are greater than zero and FALSEs | corresponding to values of y that are less than or equal to | zero. What do you think y[y > 0] will give you?
1: A vector of all NAs
2: A vector of all the positive elements of y
3: A vector of length 0
4: A vector of all the negative elements of y
5: A vector of TRUEs and FALSEs
Selection: 2
| All that hard work is paying off!
|=================== | 34%
| Type y[y > 0] to see that we get all of the positive elements | of y, which are also the positive elements of our original | vector x.
> y[y>0]
 [1] 1.0161235 0.1739052 0.3710163 0.6581863 1.0388500 0.1617555
 [7] 0.4002425 0.5301859 0.2821158 0.7949346 0.6059843 0.1755035
[13] 0.3915319 1.0798950
| You're the best!
|===================== | 37%
| You might wonder why we didn't just start with x[x > 0] to | isolate the positive elements of x. Try that now to see why.
> x[x>0]
 [1]        NA 1.0161235 0.1739052        NA        NA        NA
 [7]        NA 0.3710163 0.6581863 1.0388500 0.1617555        NA
[13]        NA        NA        NA 0.4002425        NA 0.5301859
[19]        NA        NA        NA        NA        NA 0.2821158
[25]        NA 0.7949346 0.6059843        NA        NA 0.1755035
[31] 0.3915319        NA 1.0798950        NA
| Keep up the great work!
|====================== | 39%
| Since NA is not a value, but rather a placeholder for an | unknown quantity, the expression NA > 0 evaluates to NA. Hence | we get a bunch of NAs mixed in with our positive numbers when | we do this.
...
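To see where those NAs come from, it may help to look at the comparison result itself. A minimal sketch, assuming a hypothetical vector v (not the lesson's x):

```r
# Hypothetical example vector (not the lesson's x)
v <- c(2.5, NA, -1, NA, 4)

# Comparing NA with anything yields NA, so the logical index contains NAs
cmp <- v > 0        # TRUE NA FALSE NA TRUE

# An NA in a logical index produces an NA in the result
picked <- v[cmp]    # 2.5 NA NA 4.0

print(cmp)
print(picked)
```

The -1 is dropped because its index value is FALSE, while each NA index value passes an NA through to the output.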
|======================== | 42%
| Combining our knowledge of logical operators with our new | knowledge of subsetting, we could do this: x[!is.na(x) & x > | 0]. Try it out.
> x[!is.na(x)&x>0]
 [1] 1.0161235 0.1739052 0.3710163 0.6581863 1.0388500 0.1617555
 [7] 0.4002425 0.5301859 0.2821158 0.7949346 0.6059843 0.1755035
[13] 0.3915319 1.0798950
| You got it right!
|========================= | 45%
| In this case, we request only values of x that are both | non-missing AND greater than zero.
...
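The combined condition works because FALSE & NA evaluates to FALSE, so the NA positions are excluded rather than passed through. The sketch below uses a hypothetical vector v; the which() variant is an alternative that the lesson does not cover and is shown only for comparison.

```r
# Hypothetical example vector (not the lesson's x)
v <- c(2.5, NA, -1, NA, 4)

# For NA elements: !is.na(v) is FALSE, and FALSE & NA is FALSE
keep <- !is.na(v) & v > 0      # TRUE FALSE FALSE FALSE TRUE
positives <- v[keep]           # 2.5 4.0

# Alternative (not from the lesson): which() returns only the
# positions that are TRUE, silently skipping NA and FALSE
positives_alt <- v[which(v > 0)]

print(keep)
print(positives)
print(positives_alt)
```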
|=========================== | 47%
| I've already shown you how to subset just the first ten values | of x using x[1:10]. In this case, we're providing a vector of | positive integers inside of the square brackets, which tells R | to return only the elements of x numbered 1 through 10.
...
|============================ | 50%
| Many programming languages use what's called 'zero-based | indexing', which means that the first element of a vector is | considered element 0. R uses 'one-based indexing', which (you | guessed it!) means the first element of a vector is considered | element 1.
...
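A short sketch of one-based indexing with positive integers, using a hypothetical vector v:

```r
# Hypothetical example vector
v <- c(10, 20, 30, 40)

first_elem <- v[1]            # 10: the first element is element 1
last_elem  <- v[length(v)]    # 40: the last element is element length(v)
odd_spots  <- v[c(1, 3)]      # 10 30: several positions at once

print(first_elem)
print(last_elem)
print(odd_spots)
```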
|============================= | 53%
| Can you figure out how we'd subset the 3rd, 5th, and 7th | elements of x? Hint: Use the c() function to specify the | element numbers as a numeric vector.
> x[c(3,5,7)]
[1]  0.1739052 -0.6246671 -2.5726967
| Keep up the great work!
|=============================== | 55%
| It's important that when using integer vectors to subset our | vector x, we stick with the set of indexes {1, 2, ..., 40} | since x only has 40 elements. What happens if we ask for the | zeroth element of x (i.e. x[0])? Give it a try.
> x[0]
numeric(0)
| You're the best!
|================================ | 58%
| As you might expect, we get nothing useful. Unfortunately, R | doesn't prevent us from doing this. What if we ask for the | 3000th element of x? Try it out.
> x[3000]
[1] NA
| Great job!
|================================== | 61%
| Again, nothing useful, but R doesn't prevent us from asking for | it. This should be a cautionary tale. You should always make | sure that what you are asking for is within the bounds of the | vector you're working with.
...
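One way to guard against out-of-bounds requests is to filter the indexes before using them. This is a sketch under stated assumptions, not something the lesson prescribes; v, idx, in_bounds, and safe are hypothetical names.

```r
# Hypothetical example vector and candidate indexes
v <- c(10, 20, 30)
idx <- c(0, 2, 5)

zero_result <- v[0]    # numeric(0): index 0 selects nothing
past_end    <- v[5]    # NA: there is no 5th element

# Keep only indexes between 1 and length(v)
in_bounds <- idx >= 1 & idx <= length(v)   # FALSE TRUE FALSE
safe <- v[idx[in_bounds]]                  # 20

print(zero_result)
print(past_end)
print(safe)
```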
|=================================== | 63%
| What if we're interested in all elements of x EXCEPT the 2nd | and 10th? It would be pretty tedious to construct a vector | containing all numbers 1 through 40 EXCEPT 2 and 10.
...
|===================================== | 66%
| Luckily, R accepts negative integer indexes. Whereas x[c(2, | 10)] gives us ONLY the 2nd and 10th elements of x, x[c(-2, | -10)] gives us all elements of x EXCEPT for the 2nd and 10th | elements. Try x[c(-2, -10)] now to see this.
> x[c(-2,-10)]
 [1]          NA  0.17390520          NA -0.62466706          NA
 [6] -2.57269671          NA -0.44002462  0.37101633  0.65818630
[11]  1.03885003  0.16175551          NA -0.32999611          NA
[16]          NA          NA  0.40024254          NA  0.53018587
[21]          NA          NA          NA          NA          NA
[26]  0.28211580 -0.04009442          NA  0.79493463  0.60598426
[31]          NA -1.42021598          NA  0.17550349  0.39153186
[36]          NA  1.07989501          NA
| Excellent job!
|====================================== | 68%
| A shorthand way of specifying multiple negative numbers is to | put the negative sign out in front of the vector of positive | numbers. Type x[-c(2, 10)] to get the exact same result.
> x[-c(2,10)]
 [1]          NA  0.17390520          NA -0.62466706          NA
 [6] -2.57269671          NA -0.44002462  0.37101633  0.65818630
[11]  1.03885003  0.16175551          NA -0.32999611          NA
[16]          NA          NA  0.40024254          NA  0.53018587
[21]          NA          NA          NA          NA          NA
[26]  0.28211580 -0.04009442          NA  0.79493463  0.60598426
[31]          NA -1.42021598          NA  0.17550349  0.39153186
[36]          NA  1.07989501          NA
| All that hard work is paying off!
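The equivalence of the two negative-index forms can be checked on a smaller, hypothetical vector v. The last step goes slightly beyond the lesson: it shows that R rejects an index vector mixing positive and negative integers.

```r
# Hypothetical example vector
v <- c(10, 20, 30, 40, 50)

drop_a <- v[c(-2, -4)]   # 10 30 50: everything except elements 2 and 4
drop_b <- v[-c(2, 4)]    # same result via the shorthand

same <- identical(drop_a, drop_b)

# Mixing positive and negative indexes is an error in R;
# tryCatch() turns that error into a marker string here
mixed <- tryCatch(v[c(-1, 2)], error = function(e) "error")

print(drop_a)
print(same)
print(mixed)
```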
|======================================== | 71%
| So far, we've covered three types of index vectors: logical, | positive integer, and negative integer. The only remaining type | requires us to introduce the concept of 'named' elements.
...
|========================================= | 74%
| Create a numeric vector with three named elements using vect <- | c(foo = 11, bar = 2, norf = NA).
> vect<-c(foo=11,bar=2,norf=NA)
| Great job!
|=========================================== | 76%
| When we print vect to the console, you'll see that each element | has a name. Try it out.
> vect
 foo  bar norf
  11    2   NA
| You are amazing!
|============================================ | 79%
| We can also get the names of vect by passing vect as an | argument to the names() function. Give that a try.
> names(vect)
[1] "foo"  "bar"  "norf"
| You are quite good my friend!
|============================================== | 82%
| Alternatively, we can create an unnamed vector vect2 with c(11, | 2, NA). Do that now.
> vect2<-c(11,2,NA)
| That's the answer I was looking for.
|=============================================== | 84%
| Then, we can add the `names` attribute to vect2 after the fact | with names(vect2) <- c("foo", "bar", "norf"). Go ahead.
> names(vect2)<-c("foo","bar","norf")
| You got it right!
|================================================= | 87%
| Now, let's check that vect and vect2 are the same by passing | them as arguments to the identical() function.
> identical(vect,vect2)
[1] TRUE
| Great job!
|================================================== | 89%
| Indeed, vect and vect2 are identical named vectors.
...
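The role of the names attribute in that comparison can be made explicit with a small sketch. The vectors a and b below are hypothetical stand-ins for vect and vect2; the point is that identical() compares attributes as well as values.

```r
# Hypothetical stand-ins for vect and vect2
a <- c(foo = 11, bar = 2, norf = NA)
b <- c(11, 2, NA)

# Same values, but b has no names yet, so the vectors differ
before <- identical(a, b)      # FALSE

# Adding the names attribute after the fact makes them identical
names(b) <- c("foo", "bar", "norf")
after <- identical(a, b)       # TRUE

print(before)
print(after)
print(names(b))
```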
|==================================================== | 92%
| Now, back to the matter of subsetting a vector by named | elements. Which of the following commands do you think would | give us the second element of vect?
1: vect["bar"]
2: vect["2"]
3: vect[bar]
Selection: 1
| You are really on a roll!
|===================================================== | 95%
| Now, try it out.
> vect["bar"]
bar
  2
| All that hard work is paying off!
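For comparison, here is a sketch of what the other two quiz options do. It assumes a fresh session in which no object named bar exists; the variable names by_name, by_string_two, and unquoted are hypothetical.

```r
vect <- c(foo = 11, bar = 2, norf = NA)

# A character index is matched against the names attribute
by_name <- vect["bar"]          # the element named "bar", value 2

# "2" is treated as a name, not a position; no element has that name
by_string_two <- vect["2"]      # NA

# Without quotes, R looks for an object called bar.
# Assumption: no such object exists, so this raises an error,
# which tryCatch() turns into a marker string.
unquoted <- tryCatch(vect[bar], error = function(e) "error")

print(by_name)
print(by_string_two)
print(unquoted)
```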
|======================================================= | 97%
| Likewise, we can specify a vector of names with vect[c("foo", | "bar")]. Try it out.
> vect[c("foo","bar")]
foo bar
 11   2
| You are doing so well!
|========================================================| 100%
| Now you know all four methods of subsetting data from vectors. | Different approaches are best in different scenarios and when | in doubt, try it out!
...
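As a recap, the four kinds of index vector can be applied to one small named vector to select the same elements. The vector v and the result names are hypothetical and chosen only for illustration.

```r
# Hypothetical named vector with one missing value
v <- c(a = 3, b = NA, c = -1, d = 7)

by_logical  <- v[!is.na(v) & v > 0]   # logical vector
by_positive <- v[c(1, 4)]             # positive integers
by_negative <- v[-c(2, 3)]            # negative integers
by_name     <- v[c("a", "d")]         # character strings (names)

print(by_logical)

# All four approaches select elements a and d
all_same <- identical(by_logical, by_positive) &&
  identical(by_positive, by_negative) &&
  identical(by_negative, by_name)
print(all_same)
```

Which form reads best depends on whether the selection is defined by a condition, by position, or by name.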
| Are you currently enrolled in the Coursera course associated | with this lesson?
1: Yes
2: No
Selection: 2
| You've reached the end of this lesson! Returning to the main | menu...