This function combines several common operations for formatting titles and names: splitting words, replacing separators, and adjusting capitalization. It's meant to be simple yet flexible.

clean_titles(
  x,
  cap_all = FALSE,
  split_case = TRUE,
  keep_running_caps = TRUE,
  space = "_",
  remove = NULL
)

Arguments

x

A character vector

cap_all

Logical: if TRUE, first letter of each word after splitting will be capitalized. If FALSE, only the first character of the string will be capitalized. Note that in order to balance this with respecting consecutive capital letters, such as from acronyms,

split_case

Logical: if TRUE, a lowercase letter immediately followed by an uppercase letter will be treated as the boundary between two words, and the words will be separated.

keep_running_caps

Logical: if TRUE, runs of consecutive uppercase letters, such as acronyms, will be kept uppercase.

space

Character vector of characters and/or regex patterns that should be replaced with a space to separate words.

remove

Character vector of characters and/or regex patterns that will be removed before any other operations; if NULL, nothing is removed.
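As a rough illustration of how patterns like those passed to remove and space behave, the following base-R sketch applies them with gsub(). It is not the package's implementation: joining several patterns with "|" is an assumption made here for illustration, and the variable names are hypothetical.

```r
# Illustration only: plain base-R regex calls, not clean_titles() internals.
titles <- c("New Haven town, New Haven County, Connecticut",
            "Male!College_Graduates")

# A `remove`-style pattern: delete " town," and everything after it.
removed <- gsub(" town,.+", "", titles)

# `space`-style patterns: join several into one alternation (an assumption
# about how multiple patterns could be combined), then replace with a space.
space_patterns <- c("_", "!")
spaced <- gsub(paste(space_patterns, collapse = "|"), " ", removed)

print(spaced)
```

Because these are regex patterns, characters with special meaning, such as "." or "+", need to be escaped to match literally.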

Value

A character vector with each element reformatted

Details

Common operations include:

  • "TownName" --> "Town Name"

  • "town_name" --> "Town Name"

  • "town_name" --> "Town name"

  • "RegionABC" --> "Region ABC"

  • "TOWN_NAME" --> "Town Name"
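The operations listed above can be approximated in base R. The sketch below is a hypothetical helper, sketch_clean(), written only to show the kind of regex steps involved. It is not how clean_titles() is implemented, it handles only underscores as separators, and its results may differ from the package's in edge cases.

```r
# Hypothetical helper for illustration; not part of the package.
sketch_clean <- function(x, cap_all = FALSE, keep_running_caps = TRUE) {
  # Replace underscores with spaces.
  x <- gsub("_", " ", x, fixed = TRUE)
  # Insert a space between a lowercase letter and a following uppercase letter.
  x <- gsub("([a-z])([A-Z])", "\\1 \\2", x)
  # Optionally drop all capitalization before re-capitalizing.
  if (!keep_running_caps) x <- tolower(x)
  if (cap_all) {
    # Capitalize the first letter of every word (\\U requires perl = TRUE).
    x <- gsub("\\b([a-z])", "\\U\\1", x, perl = TRUE)
  } else {
    # Capitalize only the first character of the string.
    x <- sub("^([a-z])", "\\U\\1", x, perl = TRUE)
  }
  x
}

sketch_clean("TownName")
sketch_clean("town_name", cap_all = TRUE)
sketch_clean("town_name")
sketch_clean("RegionABC")
sketch_clean("TOWN_NAME", cap_all = TRUE, keep_running_caps = FALSE)
```

In this sketch, lowercasing happens after the case split so that "TownName" is still divided into two words. The order of steps inside clean_titles() itself is not documented here.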

Examples

t1 <- c("GreaterNewHaven", "greater_new_haven", "GREATER_NEW_HAVEN")
clean_titles(t1, cap_all = TRUE, keep_running_caps = FALSE)
#> [1] "Greater New Haven" "Greater new haven" "Greater New Haven"

t2 <- c("Male!CollegeGraduates", "Male CollegeGraduates")
clean_titles(t2, space = c("_", "!"))
#> [1] "Male college graduates" "Male college graduates"

t3 <- c("Greater BPT Men", "Greater BPT Men HBP", "GreaterBPT_men", "greaterBPT")
clean_titles(t3, cap_all = FALSE)
#> [1] "Greater BPT men"     "Greater BPT men HBP" "Greater BPT men"    
#> [4] "Greater BPT"        

t4 <- c("New Haven town, New Haven County, Connecticut",
        "Newtown town, Fairfield County, Connecticut")
clean_titles(t4, cap_all = TRUE, remove = " town,.+")
#> [1] "New Haven" "Newtown"