Do more with R: Quick lookup tables using named vectors

Sharon Machlis November 30, 2018


What’s the state abbreviation for Arkansas? Is it AR? AK? AS?

Maybe you’ve got a data frame with the information, or any data where one column holds categories and another holds values. Chances are, at some point you’d like to look up a value by its category, sometimes known as the key. A lot of programming languages have ways to work with key-value pairs. This is easy to do in R, too, with named vectors. Here’s how.

I’ve got data with state names and abbreviations, which I’ve stored in a data frame named postal_df. (The code to create that data frame is at the bottom of this post if you’d like to follow along.)

I’ll run tail(postal_df) to see what that looks like.

           State PostalCode
45       Vermont         VT
46      Virginia         VA
47    Washington         WA
48 West Virginia         WV
49     Wisconsin         WI
50       Wyoming         WY

In a named vector used as a lookup table, the values are the vector’s elements and the keys are its names. So let me first make a vector of the values, which are in the PostalCode column:

getpostalcode <- postal_df$PostalCode

And next I add names from the State column.

names(getpostalcode) <- postal_df$State
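The two steps above can also be collapsed into one call with setNames(), from the stats package that R loads by default. Here is a minimal sketch, assuming a three-row stand-in for postal_df; the names small_df and lookup_postal are mine, for illustration only.

```r
# Hypothetical three-row stand-in for postal_df (illustration only)
small_df <- data.frame(
  State = c("Alaska", "Arkansas", "Arizona"),
  PostalCode = c("AK", "AR", "AZ"),
  stringsAsFactors = FALSE
)

# setNames(values, names) builds the named vector in a single step
lookup_postal <- setNames(small_df$PostalCode, small_df$State)

# Look up one key, exactly as with a vector built in two steps
lookup_postal["Arkansas"]
```

The result should behave the same as a vector built with names()<-, so which form to use is mostly a matter of taste.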

To use this named vector as a lookup table, the format is mylookupvector['key'].

So here’s how to get the postal code for Arkansas:

getpostalcode['Arkansas']
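Because subsetting by name is vectorized, the same syntax should also accept several keys at once. The sketch below uses a small hypothetical vector called codes rather than the full getpostalcode, and also shows what comes back for a key that isn’t in the names.

```r
# Hypothetical small lookup vector (illustration only)
codes <- c(Arkansas = "AR", Alaska = "AK", Vermont = "VT")

# Several keys at once; results come back in the order requested,
# and a key may be repeated
wanted <- c("Vermont", "Arkansas", "Vermont")
several <- codes[wanted]
several

# A key that is not among the names returns NA instead of raising an error
missing_result <- codes["Narnia"]
missing_result
```

The silent NA is worth keeping in mind: a misspelled state name won’t stop your code, it will just produce a missing value.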

If you want just the value, without the key, wrap the result in the unname() function:

unname(getpostalcode['Arkansas'])
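For a single key, double square brackets are another way to drop the name. This is an alternative I’m adding for comparison, not something the rest of this post relies on, and it behaves differently for a missing key: it raises an error instead of returning NA. A minimal sketch with a hypothetical codes vector:

```r
# Hypothetical small lookup vector (illustration only)
codes <- c(Arkansas = "AR", Alaska = "AK")

# [[ ]] extracts exactly one element and drops its name
one_value <- codes[["Arkansas"]]
one_value

# Unlike [ ], [[ ]] raises an error for a key that isn't in the names.
# tryCatch() is used here only so that the sketch keeps running.
result <- tryCatch(
  codes[["Narnia"]],
  error = function(e) "lookup failed"
)
result
```

Note that [[ ]] takes only one key at a time, so unname() with single brackets remains the better fit when looking up several keys.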

That’s all there is to it. I know this is a somewhat trivial example, but it has some real-world use. For example, I’ve got a named vector of FIPS codes that I need when working with US Census data.

I started with a data frame of states and FIPS codes called fipsdf, imported here from a CSV file (code to create the same data frame directly is at the bottom of this post). Next, I created a vector called getfips from the data frame’s FIPS code column and added the states as names.

fipsdf <- rio::import("data/FIPS.csv")
getfips <- fipsdf$FIPS
names(getfips) <- fipsdf$State

Now if I want the FIPS code for Massachusetts, I can use getfips['Massachusetts']. To get just the value without the name, I wrap it in unname(): unname(getfips['Massachusetts']).

If having to keep using unname() gets too annoying, you can even make a little function from your lookup table:

get_state_fips <- function(state, lookupvector = getfips) {
  fipscode <- unname(lookupvector[state])
  return(fipscode)
}

Here, I’ve got two arguments to my function. One is my “key,” in this case the state name; the other is lookupvector, which defaults to my getfips vector.

To use the function, I call it with just one argument, the state name: get_state_fips("New York").
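One thing a wrapper function makes easy is guarding against typos in the key. The variant below is a sketch of my own, not part of the original function: get_state_fips_strict and the two-state fips_lookup vector are hypothetical names, and the choice to stop with an error on an unknown state is an assumption about what you’d want.

```r
# Hypothetical two-state lookup vector (illustration only)
fips_lookup <- c(Massachusetts = "25", "New York" = "36")

get_state_fips_strict <- function(state, lookupvector = fips_lookup) {
  # Find any requested keys that are not among the vector's names
  missing_keys <- setdiff(state, names(lookupvector))
  if (length(missing_keys) > 0) {
    stop("Unknown state(s): ", paste(missing_keys, collapse = ", "))
  }
  unname(lookupvector[state])
}

get_state_fips_strict("New York")
```

Calling get_state_fips_strict("Narnia") would stop with an error naming the unknown key, rather than quietly returning NA.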

I can make a function that looks a bit more generic, such as:

get_value <- function(mykey, mylookupvector) {
  myvalue <- mylookupvector[mykey]
  myvalue <- unname(myvalue)
  return(myvalue)
}

It has a more generic function name, get_value(); a more generic first argument name, mykey; and a second argument, mylookupvector, that doesn’t default to anything.

It’s the same thing I’ve been doing all along: getting the value from the lookup vector with lookupvector['key'] and then running the unname() function. But it’s all wrapped inside a function. So, calling it is a bit more elegant.

I can use that function with any named vector I’ve created. Here, I’m using it with Arkansas and my getpostalcode vector: get_value("Arkansas", getpostalcode).
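Since the function accepts a whole vector of keys, one likely use is adding a looked-up column to another data frame. The sketch below repeats get_value() so it runs on its own; the visits data frame and the short codes vector are made up for illustration.

```r
# Same logic as get_value() above, repeated so this sketch is self-contained
get_value <- function(mykey, mylookupvector) {
  unname(mylookupvector[mykey])
}

# Hypothetical lookup vector and data (illustration only)
codes <- c(Arkansas = "AR", Vermont = "VT")
visits <- data.frame(
  State = c("Vermont", "Arkansas", "Vermont"),
  Visits = c(3, 5, 2),
  stringsAsFactors = FALSE
)

# Look up every row's state in one call and store the result as a new column
visits$PostalCode <- get_value(visits$State, codes)
visits
```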

Easy lookups in R! Just remember that keys should be unique. You can repeat values, but if a name appears more than once, R won’t complain and a lookup by that name returns only the first match.
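A quick way to check for repeated keys before relying on a lookup vector is anyDuplicated(). The sketch below uses a made-up dup_lookup vector to show both what a lookup does with a repeated name and how the check flags it.

```r
# Hypothetical vector with a repeated name (illustration only)
dup_lookup <- c(Springfield = "IL", Springfield = "MA", Portland = "OR")

# A lookup by a repeated name returns only the first matching element
first_match <- dup_lookup["Springfield"]
first_match

# anyDuplicated() returns the position of the first repeated name,
# or 0 when every name is unique
dup_position <- anyDuplicated(names(dup_lookup))
dup_position
```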

I first saw this idea years ago in Hadley Wickham’s Advanced R book. I still use it a lot and hope you find it helpful, too.

Code to create data frame with postal abbreviations

postal_df <- data.frame(stringsAsFactors=FALSE,
State = c("Alabama", "Alaska", "Arizona", "Arkansas", "California",
"Colorado", "Connecticut", "Delaware", "Florida", "Georgia",
"Hawaii", "Idaho", "Illinois", "Indiana", "Iowa", "Kansas",
"Kentucky", "Louisiana", "Maine", "Maryland", "Massachusetts",
"Michigan", "Minnesota", "Mississippi", "Missouri", "Montana",
"Nebraska", "Nevada", "New Hampshire", "New Jersey", "New Mexico",
"New York", "North Carolina", "North Dakota", "Ohio",
"Oklahoma", "Oregon", "Pennsylvania", "Rhode Island", "South Carolina",
"South Dakota", "Tennessee", "Texas", "Utah", "Vermont",
"Virginia", "Washington", "West Virginia", "Wisconsin", "Wyoming"),
PostalCode = c("AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA",
"HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD",
"MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ",
"NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC", "SD",
"TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY")
)

Code to create data frame with FIPS codes

fipsdf <- data.frame(State = c("Alabama", "Alaska", "Arizona", "Arkansas", 
"California", "Colorado", "Connecticut", "Delaware", "Florida", 
"Georgia", "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa", 
"Kansas", "Kentucky", "Louisiana", "Maine", "Maryland", "Massachusetts", 
"Michigan", "Minnesota", "Mississippi", "Missouri", "Montana", 
"Nebraska", "Nevada", "New Hampshire", "New Jersey", "New Mexico", 
"New York", "North Carolina", "North Dakota", "Ohio", "Oklahoma", 
"Oregon", "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota", 
"Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington", 
"West Virginia", "Wisconsin", "Wyoming"), FIPS = c("01", "02", 
"04", "05", "06", "08", "09", "10", "12", "13", "15", "16", "17", 
"18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", 
"29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", 
"40", "41", "42", "44", "45", "46", "47", "48", "49", "50", "51", 
"53", "54", "55", "56"), stringsAsFactors = FALSE)
