Wrong Selection in R

While reading the section on data frames in The Book of R, I got a result that differed from the one in the exercise. The cause was a comma I had left out of the code. Previously I would have just corrected it and moved on, but this time round I wanted to know why I got that wrong result.

The data frame I was working with is:

mydata Data Frame

The code I typed was mydata[mydata$sex=="F"]. This resulted in the following output:

Unexpected Results

Not quite what I expected. I then typed out just the test itself, mydata$sex=="F", and got the following logical vector.

Logicals vector

Looking at it and comparing it to the output I got, things started making sense. The two columns I got back from my original code lined up with the positions of the TRUE values in the vector from the second piece of code. By leaving out the comma I had selected columns instead of the rows meeting the test criteria: with a single index and no comma, R treats the data frame as a list of columns, so the logical vector picked columns by position.
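To make the difference concrete, here is a minimal sketch of both forms. The data frame below is a hypothetical stand-in, since the original mydata was only shown as a screenshot; its column names and values are my assumptions, chosen so that it has as many columns as rows and the logical vector maps cleanly onto either.

```r
# Hypothetical stand-in for mydata (the post's actual data is not reproduced here).
mydata <- data.frame(
  person  = c("Peter", "Lois", "Meg", "Chris", "Stewie"),
  age     = c(42, 40, 17, 14, 1),
  sex     = factor(c("M", "F", "F", "M", "M")),
  funny   = c("High", "High", "Low", "Med", "High"),
  age.mon = c(504, 480, 204, 168, 12),
  stringsAsFactors = FALSE
)

# The test on its own: one logical value per row.
is_female <- mydata$sex == "F"
print(is_female)            # FALSE TRUE TRUE FALSE FALSE

# No comma: the data frame is indexed like a list, so TRUE positions pick COLUMNS.
cols_only <- mydata[is_female]
print(names(cols_only))     # "age" "sex" (columns 2 and 3)

# With the comma: the logical vector is the row index, and all columns are kept.
rows_only <- mydata[is_female, ]
print(rows_only$person)     # "Lois" "Meg"
```

The trailing comma is what tells R that the vector is a row index and that the (empty) column index means "all columns".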

With this knowledge, I typed out the proper code, mydata[mydata$sex=="F",], to get:

Expected Results

The result I expected. Forcing myself to slow down and actually understand why code gives the output it does is doing wonders for my learning progress. Had I not understood from earlier reading and exercises how row and column selection coupled with logical values actually works, this would have been a much more difficult issue to figure out. Onwards with learning R!


