Retrieve a copy of the internal lookup table for all known ICD codes.

Usage

get_icd_codes(with.descriptions = FALSE, with.hierarchy = FALSE)

Arguments

with.descriptions

Logical scalar; if TRUE, include the descriptions of the codes.

with.hierarchy

Logical scalar; if TRUE, include the ICD hierarchy.

Value

a data.frame

The default return has the following columns:

  • icdv: Integer vector indicating whether the code is from ICD-9 (9) or ICD-10 (10)

  • dx: Integer vector. 1 if the code is diagnostic (ICD-9-CM, ICD-10-CM, WHO, CDC Mortality), or 0 if the code is procedural (ICD-9-PCS, ICD-10-PCS)

  • full_code: Character vector with the ICD code and any relevant decimal point

  • code: Character vector with the compact ICD code omitting any relevant decimal point

  • src: Character vector reporting the source of the information. See Details.

  • known_start: Integer vector reporting the first known year of use. See Details.

  • known_end: Integer vector reporting the last known year of use. See Details.

  • assignable_start: Integer vector reporting the first known year the code was assignable. See Details.

  • assignable_end: Integer vector reporting the last known year the code was assignable. See Details.
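The relationship between full_code and code can be illustrated with base R. This is only a sketch using a few values copied from the example output on this page; it is not necessarily how the package builds the column.

```r
# Toy values copied from the example output on this page
full_code <- c("00", "00.0", "00.01", "C86.0", "C86.00")

# Drop the literal "." to get the compact form.
# fixed = TRUE keeps "." from being treated as a regex wildcard.
compact <- sub(".", "", full_code, fixed = TRUE)
compact
#> [1] "00"    "000"   "0001"  "C860"  "C8600"
```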

When with.descriptions = TRUE there are the following additional columns:

  • desc: Character vector of descriptions. For cms codes, descriptions from CMS are used preferentially over those from the CDC.

  • desc_start: Integer vector of the first year the description was used.

  • desc_end: Integer vector of the last year the description was used.

When with.hierarchy = TRUE there are the following additional columns:

  • chapter

  • subchapter

  • category

  • subcategory

  • subclassification

  • subsubclassification

  • extension
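A minimal sketch of how the two arguments extend the default return, assuming the medicalcoder package is installed (the guard skips the calls otherwise). The variable names base_cols, codes, and extra are illustrative only.

```r
if (requireNamespace("medicalcoder", quietly = TRUE)) {
  # Column names of the default return
  base_cols <- names(medicalcoder::get_icd_codes())

  # Request both descriptions and hierarchy
  codes <- medicalcoder::get_icd_codes(
    with.descriptions = TRUE,
    with.hierarchy    = TRUE
  )

  # Columns added by the two arguments
  extra <- setdiff(names(codes), base_cols)
  print(extra)
}
```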

Details

Sources

There are three sources of ICD codes.

  • cms: Codes from the ICD-9-CM, ICD-9-PCS, ICD-10-CM, and ICD-10-PCS standards.

  • who: Codes from the World Health Organization.

  • cdc: Codes from the CDC Mortality coding standard.

Fiscal and Calendar Years

Reported years are a mix of fiscal and calendar years, depending on the source.

Fiscal years are the United States Federal Government fiscal years, running from October 1 to September 30. For example, fiscal year 2013 started on October 1, 2012 and ended on September 30, 2013.

Calendar years run January 1 to December 31.

Within the ICD data there are columns known_start, known_end, assignable_start, assignable_end, desc_start and desc_end. For ICD codes with src == "cms", these are fiscal years. For codes with src == "cdc" or src == "who" these are calendar years.
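When matching a dated record against these columns, the date first has to be mapped to the right kind of year. The following is a minimal base R sketch; fiscal_year() and coding_year() are hypothetical helpers written for illustration and are not part of medicalcoder.

```r
# Hypothetical helper: US federal fiscal year for a date.
# October through December belong to the next fiscal year.
fiscal_year <- function(date) {
  date <- as.Date(date)
  yr <- as.integer(format(date, "%Y"))
  mo <- as.integer(format(date, "%m"))
  yr + as.integer(mo >= 10L)
}

# Hypothetical helper: fiscal year for "cms", calendar year otherwise
coding_year <- function(date, src) {
  date <- as.Date(date)
  ifelse(src == "cms", fiscal_year(date), as.integer(format(date, "%Y")))
}

fiscal_year(c("2012-09-30", "2012-10-01", "2013-09-30"))
#> [1] 2012 2013 2013
coding_year("2012-10-01", c("cms", "who", "cdc"))
#> [1] 2013 2012 2012
```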

known_start is the first fiscal or calendar year (depending on source) for which the medicalcoder package has definitive source data. ICD-9-CM started in the United States in fiscal year 1980, but the source information that could be downloaded from the CDC and CMS and added to the source code for the medicalcoder package goes back only to 1997. As such, 1997 is the "known start".

known_end is the last fiscal or calendar year (depending on source) for which we have definitive source data. For ICD-9-CM and ICD-9-PCS that is 2015. For ICD-10-CM and ICD-10-PCS, which are still active, it is simply the last year of known data. ICD-10 from the WHO ends in 2019.

Header and Assignable Codes

"Assignable" indicates that the code is the most granular for the source. Ideally codes are reported with the greatest level of detail but that is not always the case. Also, the greatest level of detail can differ between sources. Example: C86 is a header code for cms and who because codes C86.0, C86.1, C86.2, C86.3, C86.4, C86.5, and C86.6 all exist in both standards. No code with a fifth digit exists in the who so all these four digit codes are 'assignable.' In the cms standard, C86.0 was assignable through fiscal year 2024. In fiscal year 2025 codes C86.00 and C86.01 were added making C86.0 a header code and C86.00 and C86.01 assignable codes.

Examples

icd_codes <- get_icd_codes()
str(icd_codes)
#> 'data.frame':	227534 obs. of  9 variables:
#>  $ icdv            : int  9 9 9 9 9 9 9 9 9 9 ...
#>  $ dx              : int  0 0 0 0 0 0 1 0 1 0 ...
#>  $ full_code       : chr  "00" "00.0" "00.01" "00.02" ...
#>  $ code            : chr  "00" "000" "0001" "0002" ...
#>  $ src             : chr  "cms" "cms" "cms" "cms" ...
#>  $ known_start     : int  2003 2003 2003 2003 2003 2003 1997 2003 1997 2003 ...
#>  $ known_end       : int  2015 2015 2015 2015 2015 2015 2015 2015 2015 2015 ...
#>  $ assignable_start: int  NA NA 2003 2003 2003 2003 NA NA 1997 2003 ...
#>  $ assignable_end  : int  NA NA 2015 2015 2015 2015 NA NA 2015 2015 ...

# Explore the change in the assignable year for C86 code between CMS and
# WHO
subset(get_icd_codes(), grepl("^C86$", full_code))
#>        icdv dx full_code code src known_start known_end assignable_start
#> 106157   10  1       C86  C86 cms        2014      2026               NA
#> 106158   10  1       C86  C86 who        2010      2019               NA
#>        assignable_end
#> 106157             NA
#> 106158             NA
subset(get_icd_codes(), grepl("^C86\\.\\d$", full_code))
#>        icdv dx full_code code src known_start known_end assignable_start
#> 106159   10  1     C86.0 C860 cms        2014      2026             2014
#> 106160   10  1     C86.0 C860 who        2010      2019             2010
#> 106163   10  1     C86.1 C861 cms        2014      2026             2014
#> 106164   10  1     C86.1 C861 who        2010      2019             2010
#> 106167   10  1     C86.2 C862 cms        2014      2026             2014
#> 106168   10  1     C86.2 C862 who        2010      2019             2010
#> 106171   10  1     C86.3 C863 cms        2014      2026             2014
#> 106172   10  1     C86.3 C863 who        2010      2019             2010
#> 106175   10  1     C86.4 C864 cms        2014      2026             2014
#> 106176   10  1     C86.4 C864 who        2010      2019             2010
#> 106179   10  1     C86.5 C865 cms        2014      2026             2014
#> 106180   10  1     C86.5 C865 who        2010      2019             2010
#> 106183   10  1     C86.6 C866 cms        2014      2026             2014
#> 106184   10  1     C86.6 C866 who        2010      2019             2010
#>        assignable_end
#> 106159           2024
#> 106160           2019
#> 106163           2024
#> 106164           2019
#> 106167           2024
#> 106168           2019
#> 106171           2024
#> 106172           2019
#> 106175           2024
#> 106176           2019
#> 106179           2024
#> 106180           2019
#> 106183           2024
#> 106184           2019
subset(get_icd_codes(), grepl("^C86\\.0(\\d|$)", full_code))
#>        icdv dx full_code  code src known_start known_end assignable_start
#> 106159   10  1     C86.0  C860 cms        2014      2026             2014
#> 106160   10  1     C86.0  C860 who        2010      2019             2010
#> 106161   10  1    C86.00 C8600 cms        2025      2026             2025
#> 106162   10  1    C86.01 C8601 cms        2025      2026             2025
#>        assignable_end
#> 106159           2024
#> 106160           2019
#> 106161           2026
#> 106162           2026

is_icd("C86", headerok = FALSE) # FALSE
#> [1] FALSE
is_icd("C86", headerok = TRUE)  # TRUE
#> [1] TRUE
is_icd("C86", headerok = TRUE, src = "cdc") # Not a CDC mortality code
#> [1] FALSE

lookup_icd_codes("^C86\\.0\\d*", regex = TRUE)
#>    input_regex match_type icdv dx full_code  code src known_start known_end
#> 1 ^C86\\.0\\d*  full_code   10  1     C86.0  C860 who        2010      2019
#> 2 ^C86\\.0\\d*  full_code   10  1     C86.0  C860 cms        2014      2026
#> 5 ^C86\\.0\\d*  full_code   10  1    C86.00 C8600 cms        2025      2026
#> 6 ^C86\\.0\\d*  full_code   10  1    C86.01 C8601 cms        2025      2026
#>   assignable_start assignable_end
#> 1             2010           2019
#> 2             2014           2024
#> 5             2025           2026
#> 6             2025           2026