Vectors are the most frequently used data structures in R. They are as fundamental to R as arrays are to C and PHP, cons cells to Lisp, lists to Python, and classes to Java.

## What are Vectors in R?

N.B. *This article assumes that you understand the basic data types in R. If you don’t, read the article on atomic data types in R first; it shouldn’t take more than half an hour.*

A vector is a linear collection of elements or ‘components’ as they are more popularly called in R parlance. An element/ component/ member is the smallest unit of data that can be stored and has to be one of the six types described in the article on Atomic Data Types in R. Therefore, the vectors that are the subject of discussion in this article are sometimes also referred to as atomic vectors as opposed to lists that we will be discussing in upcoming tutorials.

If you are a C/C++, Java or PHP programmer then vectors are arrays for you. If you are a Lisp or Python programmer then vectors are lists for you. Remember that R has its own data structure called arrays, but arrays in R behave very differently from arrays in other programming languages. Vectors are the closest R gets to having indexed, homogeneous data structures which are allocated contiguous memory.

## Salient Features of Vectors in R

There are three distinctive features of vectors in R –

- Vectors in R are homogeneous
- Vectors in R are ‘flat’ or non-recursive
- Indexing of vectors in R starts at 1 and not 0

### Vectors are homogeneous

Vectors are homogeneous – All elements in a vector should be of the same data type. You can have a vector of integers or doubles but you can’t have a vector which mixes ints and floats.

The special value NA, read as “Not Available”, may look like an exception to this rule. On its own, R treats NA as being of the logical data type, but when NA occurs in a vector it is coerced to match the type of that vector:

```r
> x <- c(1, 2, NA, 4, 5, NA)
> typeof(x)
[1] "double"
> typeof(x[3])
[1] "double"
>
> typeof(NA)
[1] "logical"
```
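As a small sketch that is not part of the original session: besides the plain logical NA, R also ships typed missing-value constants (NA_integer_, NA_real_, NA_character_), and is.na() locates missing entries whatever the vector's type. The variable names below are purely illustrative.

```r
# Typed NA constants already carry the type of the vector they belong to
typeof(NA_integer_)    # "integer"
typeof(NA_real_)       # "double"
typeof(NA_character_)  # "character"

# A plain NA is coerced to the type of its neighbours
ints <- c(1L, NA, 3L)
typeof(ints)           # "integer"
chars <- c("a", NA)
typeof(chars)          # "character"

# is.na() finds the missing entries regardless of type
missing_at <- is.na(ints)
missing_at             # FALSE  TRUE FALSE
```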

### Vectors in R are ‘flat’ or non-recursive

Vectors are ‘flat’. They are non-recursive data structures. Even if you try to append a vector to an existing vector, R will ‘flatten’ it out.

Every single element of an atomic vector (which is what we are discussing in this article) has to be one of the six atomic data types.

```r
> x <- c(1, 2, NA, 4, 5, NA)
> y <- c(x,x)
> y
 [1]  1  2 NA  4  5 NA  1  2 NA  4  5 NA
```

As you can see in the example above, the vector y is not composed of two vectors but of the elements of the two vectors. Vectors don’t nest vectors. An attempt at nesting vectors leads to the creation of a single vector containing all the elements of those vectors.

Circuitous paths lead nowhere either:

```r
> x <- c(1, 2, NA, 4, 5, NA)
> y <- c(x, c(x))
> y
 [1]  1  2 NA  4  5 NA  1  2 NA  4  5 NA
```

But the non-recursive nature of vectors does not rule out self-assignments such as the one shown below –

```r
> x <- c(1, 1, 2, 3, 5)
> x <- c(x,x)
> x
 [1] 1 1 2 3 5 1 1 2 3 5
```
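To make the flattening in this section concrete, here is a minimal sketch that contrasts c() with list(), the recursive structure left for a later tutorial. The names flat and nested are illustrative only.

```r
x <- c(1, 2, 3)

# c() flattens its arguments, however deeply the calls are nested
flat <- c(x, c(x, c(x)))
length(flat)       # 9
is.atomic(flat)    # TRUE

# list() keeps each vector as a separate component instead
nested <- list(x, x)
length(nested)     # 2
is.atomic(nested)  # FALSE
```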

R is a weakly typed language: it does automatic type casting (called coercion) to maintain homogeneity. As with all automagic things happening around you, if you can control them you benefit. If you remain a clueless spectator, you burn your fingers. Therefore, consider it a “bad code smell” when R automagically transforms data for you.

Here is a Fibonacci series that is probably going down the wrong path. The stray TRUE is silently coerced to 1, so the result looks perfectly plausible:

```r
> x <- c(1, TRUE, 2, 3, 5)
> x
[1] 1 1 2 3 5
```
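The coercion above follows a fixed order: logical gives way to integer, integer to double, double to complex, and everything gives way to character. The sketch below illustrates that order and shows explicit coercion with as.integer(), which keeps the conversion under your control. Variable names are illustrative.

```r
# Each pair is promoted to the more general of the two types
typeof(c(TRUE, 2L))     # "integer"
typeof(c(2L, 3.5))      # "double"
typeof(c(3.5, 1+2i))    # "complex"
typeof(c(3.5, "a"))     # "character"

# One character element turns the whole vector into strings
mixed <- c(1, "2", TRUE)
mixed                   # "1"    "2"    "TRUE"

# Explicit coercion documents the intent instead of relying on magic
flags <- c(TRUE, FALSE, TRUE)
as_ints <- as.integer(flags)
as_ints                 # 1 0 1
```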

### Vectors are not zero-indexed

(The above space has been intentionally left blank. When reading hard copy of this webpage, feel free to fill it with choicest *!$#$^@).

Gulp some cold water, re-read the heading and now splash some on your face.

The first element of a vector resides at, well, first place.

What?

That’s bizarre!

Weird!!

WT*!!!

Fortunately, this is probably the only place where R is idiosyncratic.

Personally, I am so hurt by this ‘feature’ of R that I am not even going to talk about it any further.

## Creating Vectors in R

There are five ways of creating vectors in R –

### 1. c or combine Function

This is the most common way of creating a vector in R. The c function coerces (if required), combines the arguments passed to it and returns a vector.

Consider the example below –

```r
> z <- c(1-1i, 1+1i, 1, 1+3i)
> z
[1] 1-1i 1+1i 1+0i 1+3i
> typeof(z)
[1] "complex"
```

Here we passed 3 complex numbers and a numeric of value 1. R coerced the 1 from double to complex by treating it as 1+0i and returned a vector of complex numbers.

Before we move on to the four other methods of creating vectors, remember that you might run into an R old-timer who would swear that c() is short for concatenate and not combine. I don’t think R programmers can ever comprehend Shakespeare, still “what’s in a name” holds true. Let it pass. There is tremendous value in being in the good books of others, especially old-timers.

### 2. The : Operator

The colon (which has nothing to do with your guts) operator can be used to create a vector whose elements run in ‘natural’ order, that is, in steps of 1 from the left operand towards the right operand.

Let’s apply colo(g)n(e) operator –

```r
> ints <- -5:5
> ints
 [1] -5 -4 -3 -2 -1  0  1  2  3  4  5
> typeof(ints)
[1] "integer"
```

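And the same logic can be extended to doubles or floats. Note that the sequence stops at the last value that does not exceed the right operand (31.14 here, not 31.4):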

```r
> double <- 3.14 : 31.4
> double
 [1]  3.14  4.14  5.14  6.14  7.14  8.14  9.14 10.14 11.14 12.14 13.14 14.14
[13] 15.14 16.14 17.14 18.14 19.14 20.14 21.14 22.14 23.14 24.14 25.14 26.14
[25] 27.14 28.14 29.14 30.14 31.14
```
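Two behaviours of the colon operator tend to surprise newcomers, so here is a hedged sketch of both: it happily counts downwards, which makes 1:n misbehave when n is 0, and it binds more tightly than arithmetic operators. seq_len() is the usual safe alternative for the first case. The variable names are illustrative.

```r
# The colon operator counts down when the left operand is larger
countdown <- 5:1
countdown            # 5 4 3 2 1

# So 1:n is a trap when n happens to be 0
n <- 0
bad  <- 1:n          # 1 0, two elements, probably not what was intended
good <- seq_len(n)   # an empty integer vector
length(bad)          # 2
length(good)         # 0

# The colon binds tighter than +, so parenthesise when in doubt
shifted  <- 1:3 + 1    # 2 3 4
extended <- 1:(3 + 1)  # 1 2 3 4
```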

### 3. The seq function

The seq function provides a little more control over the generation of member elements.

Look at the trivial example shown below –

```r
> odd.numbers <- seq(1, 30, by=2)
> odd.numbers
 [1]  1  3  5  7  9 11 13 15 17 19 21 23 25 27 29
> typeof(odd.numbers)
[1] "double"
```

If you want integers (but why would you in R?) like the ones you got with the : operator, pass integer literals (note the L suffix) for the limits and for the by argument, which specifies the quantum of change in each step:

```r
> odd.numbers <- seq(1L, 30L, by=2L)
> typeof(odd.numbers)
[1] "integer"
```

You can also use the length.out argument to specify the total number of members you want in your vector. R will generate length.out equally spaced values between your lower and upper limits, both included:

```r
> numbers <- seq(1L, 30L, length.out=5)
> numbers
[1]  1.00  8.25 15.50 22.75 30.00
```
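A few more seq variations, sketched here as illustrations rather than as part of the original session: a negative by counts downwards, length.out = n yields n values separated by n - 1 equal steps (which is where the 7.25 spacing above comes from), and seq_along() produces the index sequence of an existing vector.

```r
# A negative step counts downwards
down <- seq(10, 1, by = -3)
down                           # 10 7 4 1

# length.out = 5 means 5 values and therefore 4 equal steps
quarters <- seq(0, 1, length.out = 5)
quarters                       # 0.00 0.25 0.50 0.75 1.00
step <- (30 - 1) / (5 - 1)
step                           # 7.25

# seq_along() gives the indexes of an existing vector
idx <- seq_along(c("a", "b", "c"))
idx                            # 1 2 3
```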

### 4. ‘constructor’ Function

We can also use the ‘constructor’ function vector to create a, well, vector. The interesting part is that you can specify the mode and the length. The mode has to be one of the values discussed in atomic types in R.

```r
> v1 <- vector(mode="numeric", length=5)
> v1
[1] 0 0 0 0 0
> v2 <- vector(mode='character', length = 3)
> v2
[1] "" "" ""
> v3 <- vector(mode="logical", length=3)
> v3
[1] FALSE FALSE FALSE
> v4 <- vector(mode="raw", length=3)
> v4
[1] 00 00 00
> v5 <- vector(mode="complex", length=3)
> v5
[1] 0+0i 0+0i 0+0i
> v6 <- vector(mode="integer", length=3)
> v6
[1] 0 0 0
```
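Base R also offers a shorthand constructor for each mode (numeric(), character(), logical() and so on), which produces the same result as the corresponding vector() call. A common use, sketched below with illustrative names, is to preallocate a vector and then fill it in a loop.

```r
# The shorthand constructors are equivalent to vector(mode, length)
identical(numeric(5),   vector(mode = "numeric",   length = 5))  # TRUE
identical(character(3), vector(mode = "character", length = 3))  # TRUE
identical(logical(3),   vector(mode = "logical",   length = 3))  # TRUE

# Preallocate, then fill element by element
squares <- numeric(5)
for (i in seq_along(squares)) {
  squares[i] <- i^2
}
squares   # 1 4 9 16 25
```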

### 5. as.vector

The as.vector function takes an object as argument and tries to generate a vector out of the object. You can specify the mode argument to tell as.vector the mode of the resultant vector.

Let’s create vectors out of a single-element integer vector:

```r
> as.vector(32L, mode="raw")
[1] 20
> as.vector(32L, mode="character")
[1] "32"
> as.vector(32L, mode="logical")
[1] TRUE
> as.vector(32L, mode="complex")
[1] 32+0i
```
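In everyday code the mode-specific helpers as.character(), as.logical() and friends are more common than as.vector() with a mode argument. as.vector() without a mode is mostly used to strip attributes, for instance to turn a matrix back into a plain vector or to drop names. The following sketch uses made-up objects m and named to illustrate this.

```r
# Mode-specific helpers give the same result as as.vector(x, mode = ...)
identical(as.character(32L), as.vector(32L, mode = "character"))  # TRUE
as.logical(0:2)            # FALSE  TRUE  TRUE (only zero is FALSE)

# as.vector() drops the dim attribute of a matrix (column by column)
m <- matrix(1:6, nrow = 2)
unrolled <- as.vector(m)
unrolled                   # 1 2 3 4 5 6

# It also drops the names of an atomic vector
named <- c(a = 1, b = 2)
stripped <- as.vector(named)
names(stripped)            # NULL
```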

## How to access Elements of a Vector?

There are two ways in R of accessing elements of a vector –

- Indexing using the ‘[‘ operator. (There is a ‘[[‘ indexing, but we defer that until next article.)
- Using (rather than calling) names of elements/ members / components

### Indexing Vectors in R

As discussed earlier, vectors in R are analogous to arrays in popular languages such as Java, C++ and C, and to lists in Python. These data structures allow their elements or members to be accessed via the ‘**[**‘ operator.

Consider the trivial example below –

```r
> morons <- c("marks", "linen", "satanil", "moa") # typos are intentional
> morons[1]
[1] "marks"
> morons[3]
[1] "satanil"
> morons[0]
character(0)
```

This trivial example highlights the facts that –

- Indexes in R start from 1 and not from 0 (we have already had enough chest-beating about that).
- In R the **‘[‘** operator returns a slice of the vector, which is itself a vector.
- R doesn’t sound a ‘sky is falling’ kind of whistle when you access a non-existent index. Accessing an index outside the current bounds throws an exception in Java, but not so in R. When we tried to access the cherished element at index 0, R returned character(0), a zero-length character vector, which is its convoluted way of saying that the index doesn’t exist.
- Morons of a certain ideology are almost always mass murderers as well.

#### Accessing non-existent index

As we learnt above, R doesn’t raise much of a hue and cry if you try to access an invalid index. In fact, R will return an NA if you go beyond the boundary of the vector.

```r
> numbers
[1]  1.00  8.25 15.50 22.75 30.00
> numbers[15]
[1] NA
```

Notice that when we accessed numbers[15], R didn’t fret. Instead it coolly said “NA”, indicating that there is some problem with our logic. Fair and reasonable enough.
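Reading and writing behave differently here, which the following sketch (with an illustrative vector x) tries to show: reading past the end returns NA and leaves the vector alone, whereas assigning past the end silently extends the vector and pads the gap with NA.

```r
x <- c(10, 20, 30)

# Reading beyond the end returns NA; x itself is unchanged
beyond <- x[5]
beyond           # NA
length(x)        # 3

# Index 0 selects nothing and returns a zero-length vector
nothing <- x[0]
length(nothing)  # 0

# Assigning beyond the end grows the vector and fills the gap with NA
x[5] <- 50
x                # 10 20 30 NA 50
length(x)        # 5
```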

#### Accessing negative index

Most programming languages take you to task for trying to access elements by a negative index. (A notable exception is Python, where a negative index means count from the right rather than the left.)

R interprets a negative index to mean that you want that particular position to be dropped and not included in the resultant vector. R returns all values as a vector except those at the negated positions.

Let’s return to our morons for a short while –

```r
> morons
[1] "marks"   "linen"   "satanil"
> morons[-2] # let's get rid of grand daddy of evil empire
[1] "marks"   "satanil"
```
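Negative indexes also accept vectors, so several positions can be dropped at once. The sketch below, using an illustrative vector letters5, also shows that positive and negative indexes cannot be mixed in one call; R raises an error, which is caught here with tryCatch() so the script keeps running.

```r
letters5 <- c("a", "b", "c", "d", "e")

# Drop several positions at once
kept <- letters5[-c(1, 3)]
kept                              # "b" "d" "e"

# Drop the last element, whatever the length
init <- letters5[-length(letters5)]
init                              # "a" "b" "c" "d"

# Note the parentheses: -(1:2) drops positions 1 and 2
rest <- letters5[-(1:2)]
rest                              # "c" "d" "e"

# Mixing positive and negative indexes is an error
outcome <- tryCatch(letters5[c(-1, 2)], error = function(e) "error")
outcome                           # "error"
```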

#### Accessing elements of a vector using a vector of indexes

Apologies. Title sounds complicated. But that’s the only thing complicated there.

The ‘[‘ operator also takes in a numeric vector as an input. The contents of the numeric vector are assumed to indicate the indexes you want to retain from the original vector. Consider the example below:

```r
> continents <- c("Asia", "Africa", "Australia", "Antarctica", "N. America", "S.America", "Europe")
> fascinating.continents <- continents[c(2, 4)]
> fascinating.continents
[1] "Africa"     "Antarctica"
```

In this example we were able to create a new vector called fascinating.continents by slicing the existing vector continents. The sliced indexes are themselves specified as a numeric vector of all the indexes that we wish to retain.

Sometimes you may want to repeat an element. R allows you to repeat the index as many times as you like. Suppose a fairy granted you three trips to continents. If you wish to use all your slots for the amazing cradle of civilization, R will not get in your way –

```r
> worth.visiting.continents = continents[c(2, 2, 2)]
> worth.visiting.continents
[1] "Africa" "Africa" "Africa"
```

And R doesn’t mind if you give the indexes in jumbled up order.

```r
> worth.visiting.continents = continents[c(4, 2, 3)]
> worth.visiting.continents
[1] "Antarctica" "Africa"     "Australia"
```

#### Accessing elements of a vector formed from : operator

Again, the title is the only complicated stuff here.

R also allows you to specify a range of indexes using the : operator.

```r
> worth.visiting.continents = continents[2 : 4]
> worth.visiting.continents
[1] "Africa"     "Australia"  "Antarctica"
```

#### Accessing elements of a vector using vector of logicals

Another way of slicing a vector in R is to use a vector of logicals.

A TRUE or T indicates that the particular value needs to be included, and FALSE or F indicates that particular value is not included.

```r
> worth.visiting.continents = continents[c(F, T, FALSE, TRUE, FALSE, F, F)]
> worth.visiting.continents
[1] "Africa"     "Antarctica"
```
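In practice the logical vector is rarely typed by hand; it usually comes from a comparison on the vector itself. The sketch below uses a made-up vector scores to show this, along with which() for recovering the matching positions. It also shows that a logical index shorter than the vector is recycled.

```r
scores <- c(12, 5, 20, 18, 3)

# A comparison yields a logical vector of the same length
over_ten <- scores > 10
over_ten                 # TRUE FALSE  TRUE  TRUE FALSE

# Use it directly as an index
high <- scores[over_ten]
high                     # 12 20 18

# which() converts the logical vector into positions
positions <- which(over_ten)
positions                # 1 3 4

# A short logical index is recycled: this keeps positions 1, 3 and 5
alternate <- scores[c(TRUE, FALSE)]
alternate                # 12 20 3
```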

### Accessing Vector Members Using Names

We can assign names to members of vectors and then use those names to access the elements –

```r
> morons
[1] "marks"   "linen"   "satanil"
> names(morons) <- c("German" , "Russian" , "Georgian")
> morons['Georgian']
 Georgian
"satanil"
> morons
   German   Russian  Georgian
  "marks"   "linen" "satanil"
```
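Names can also be supplied at creation time, directly inside c(). The following sketch uses a made-up vector prices to show that, plus lookup by several names at once and what happens when a name does not exist (R returns NA rather than an error).

```r
# Names can be given while creating the vector
prices <- c(apple = 3, banana = 1, cherry = 7)
names(prices)                       # "apple"  "banana" "cherry"

# Several names can be looked up at once, in any order
picked <- prices[c("cherry", "apple")]
picked                              # cherry 7, apple 3

# unname() strips the label when only the value is wanted
bare <- unname(prices["apple"])
bare                                # 3

# An unknown name yields NA instead of an error
absent <- prices["mango"]
is.na(absent)                       # TRUE
```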

## Arithmetic Operations on Vectors in R

Operations on vectors are carried out member-wise.

Consider the trivial example below –

```r
> hero = c(2, 4, 8, 16, 32, 64, 128)
> zero = c(0)
> hero + zero
[1]   2   4   8  16  32  64 128
> hero * zero
[1] 0 0 0 0 0 0 0
> hero / zero
[1] Inf Inf Inf Inf Inf Inf Inf
> hero - zero
[1]   2   4   8  16  32  64 128
```
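Member-wise behaviour is easier to see when both vectors have the same length and carry distinct values. A small sketch with illustrative vectors a and b:

```r
a <- c(1, 2, 3, 4)
b <- c(10, 20, 30, 40)

a + b        # 11 22 33 44
a * b        # 10 40 90 160
b / a        # 10 10 10 10
a ^ 2        # 1 4 9 16

# Comparisons are member-wise too and return a logical vector
a > 2        # FALSE FALSE  TRUE  TRUE

# Reductions such as sum() collapse the result to a single value
total <- sum(a * b)
total        # 300
```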

No surprise there. The only thing to care about when doing arithmetic operations on two vectors is the recycling rule.

### Recycling Rule for Vectors in R

This rule comes into play when the vectors that are operated upon are of unequal length. If R is asked to operate on two vectors of unequal length, it recycles (or reuses) the elements of the shorter vector, starting from index 1.

```r
> hero = c(2, 4, 8, 16, 32, 64)
> zero = c(0,0)
> hero / zero # a meaningless divide by zero operation for demo only
[1] Inf Inf Inf Inf Inf Inf
```

If the length of the longer vector is not an integral multiple of the length of the shorter vector, the recycling proceeds as per the rule, but R issues an (ugly?) warning:

```r
> hero = c(2, 4, 8, 16, 32, 64, 128)
> zero = c(0,0)
> hero / zero # a meaningless divide by zero operation for demo only
[1] Inf Inf Inf Inf Inf Inf Inf
Warning message:
In hero/zero :
  longer object length is not a multiple of shorter object length
```
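Because dividing by zero yields Inf everywhere, the examples above hide which element of the shorter vector was paired with which. The sketch below uses illustrative vectors with distinct values so the recycling pattern is visible, and uses tryCatch() to confirm that a length mismatch does raise a warning.

```r
values  <- c(1, 2, 3, 4, 5, 6)
factors <- c(10, 100)

# factors is reused as 10, 100, 10, 100, 10, 100
scaled <- values * factors
scaled                     # 10 200 30 400 50 600

# Lengths 3 and 2: recycling still happens (10, 20, 10) ...
partial <- suppressWarnings(c(1, 2, 3) + c(10, 20))
partial                    # 11 22 13

# ... but R warns about it
warned <- tryCatch({
  c(1, 2, 3) + c(10, 20)
  FALSE
}, warning = function(w) TRUE)
warned                     # TRUE
```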

## Conclusion

In this article we covered the basics of vectors in R, how to create vectors in R, how to access the elements and how to perform arithmetic operations on the elements of a vector.

Next we will examine another popular data structure – lists.
