## [1] '0.0.0.9000'

Documentation and construction of the provided data sets in Kutner et al. (2005).

Data Set C.1 SENIC

The primary objective of the Study on the Efficiency of Nosocomial Infection Control (SENIC Project; “The SENIC Project” 1980) was to determine whether infection surveillance and control programs have reduced the rates of nosocomial (hospital-acquired) infection in United States hospitals. The data set consists of a random sample of 113 hospitals selected from the original 338 hospitals surveyed.

Each line of the data set has an identification number and provides information on 11 other variables for a single hospital. The data presented here are for the 1975-76 study period. The 12 variables are:

| Variable Number | Variable Name | Description |
|---|---|---|
| 1 | id | Identification number: 1-113 |
| 2 | length_of_stay | Average length of stay of all patients in hospital (in days) |
| 3 | age | Average age of patients (in years) |
| 4 | infection_risk | Average estimated probability of acquiring infection in hospital (in percent) |
| 5 | culturing_ratio | Ratio of number of cultures performed to number of patients without signs or symptoms of hospital-acquired infection, times 100 |
| 6 | chest_x_ray_ratio | Ratio of number of X-rays performed to number of patients without signs or symptoms of pneumonia, times 100 |
| 7 | number_of_beds | Average number of beds in hospital during study period |
| 8 | medical_school_affiliation | 1 = Yes, 2 = No |
| 9 | region | Geographic region, where 1 = NE, 2 = NC, 3 = S, 4 = W |
| 10 | average_daily_census | Average number of patients in hospital per day during study period |
| 11 | number_of_nurses | Average number of full-time equivalent registered and licensed practical nurses during study period (number of full-time plus one half the number of part-time) |
| 12 | available_facilities_and_services | Percent of 35 potential facilities and services that are provided by the hospital |
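
Variables 11 and 12 are derived quantities. As a minimal sketch of how their definitions work, assuming hypothetical staffing and facility counts, they could be computed as follows. The helper names `fte_nurses` and `percent_facilities` are illustrative and are not part of the package.

```r
# Hypothetical helpers illustrating the definitions of variables 11 and 12.
# Neither function is provided by kutnerALSM5e.

# Full-time equivalent nurses: full-time count plus half the part-time count.
fte_nurses <- function(full_time, part_time) {
  full_time + 0.5 * part_time
}

# Percent of the 35 potential facilities and services that a hospital provides.
percent_facilities <- function(n_provided, n_potential = 35) {
  100 * n_provided / n_potential
}

# Made-up counts, chosen only to show the arithmetic.
fte_nurses(full_time = 200, part_time = 82)  # 241
percent_facilities(n_provided = 21)          # 60
```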

To load the data into R, use the following code.

SENIC <-
  read.table(
    file = system.file("datasets", "APPENC01.txt", package = "kutnerALSM5e")
  , header = FALSE
  , col.names = c(
      "id"
    , "length_of_stay"
    , "age"
    , "infection_risk"
    , "culturing_ratio"
    , "chest_x_ray_ratio"
    , "number_of_beds"
    , "medical_school_affiliation"
    , "region"
    , "average_daily_census"
    , "number_of_nurses"
    , "available_facilities_and_services"
    )
  )

# Modify the data to make analysis easier
SENIC[["medical_school_affiliation"]] <-
  as.integer(SENIC[["medical_school_affiliation"]] == 1)

SENIC[["region"]] <-
  factor(x = SENIC[["region"]],
         levels = 1:4,
         labels = c("North-East", "North-Central", "South", "West"))

head(SENIC)

  id length_of_stay  age infection_risk culturing_ratio chest_x_ray_ratio
1  1           7.13 55.7            4.1             9.0              39.6
2  2           8.82 58.2            1.6             3.8              51.7
3  3           8.34 56.9            2.7             8.1              74.0
4  4           8.95 53.7            5.6            18.9             122.8
5  5          11.20 56.5            5.7            34.5              88.9
6  6           9.76 50.9            5.1            21.9              97.0
  number_of_beds medical_school_affiliation        region average_daily_census
1            279                          0          West                  207
2             80                          0 North-Central                   51
3            107                          0         South                   82
4            147                          0          West                   53
5            180                          0    North-East                  134
6            150                          0 North-Central                  147
  number_of_nurses available_facilities_and_services
1              241                                60
2               52                                40
3               54                                20
4              148                                40
5              151                                40
6              106                                40
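
The recoding step can be tried without the package installed. The sketch below rebuilds the six hospitals printed above, restricted to four columns, and applies the same two transformations. The raw codes are an assumption: they are inferred by inverting the recodes in the printed output, not copied from `APPENC01.txt`. The object name `senic_head` is illustrative.

```r
# First six hospitals in the raw coding (affiliation 1/2, region 1-4).
# These raw values are inferred from the printed output, not read from the file.
senic_head <- read.table(
  text = c(
    "1 4.1 2 4",
    "2 1.6 2 2",
    "3 2.7 2 3",
    "4 5.6 2 4",
    "5 5.7 2 1",
    "6 5.1 2 2"
  ),
  header = FALSE,
  col.names = c("id", "infection_risk", "medical_school_affiliation", "region")
)

# Recode affiliation to a 0/1 indicator (1 = affiliated with a medical school).
senic_head[["medical_school_affiliation"]] <-
  as.integer(senic_head[["medical_school_affiliation"]] == 1)

# Replace the numeric region codes with labelled factor levels.
senic_head[["region"]] <-
  factor(x = senic_head[["region"]],
         levels = 1:4,
         labels = c("North-East", "North-Central", "South", "West"))

str(senic_head)
```

Storing the affiliation as a 0/1 integer lets it enter a regression directly as an indicator variable, while the factor gives `region` readable level names in model summaries and plots.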

References

Kutner, Michael H, Christopher J Nachtsheim, John Neter, and William Li. 2005. Applied Linear Statistical Models. 5th ed. McGraw-Hill.
“The SENIC Project.” 1980. American Journal of Epidemiology 111 (5). https://academic.oup.com/aje/issue/111/5.

Session Info

## R version 4.4.0 (2024-04-24)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
## 
## locale:
##  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
##  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
##  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
## [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
## 
## time zone: UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] kutnerALSM5e_0.0.0.9000
## 
## loaded via a namespace (and not attached):
##  [1] vctrs_0.6.5       cli_3.6.2         knitr_1.46        rlang_1.1.3      
##  [5] xfun_0.43         purrr_1.0.2       textshaping_0.3.7 jsonlite_1.8.8   
##  [9] htmltools_0.5.8.1 ragg_1.3.1        sass_0.4.9        rmarkdown_2.26   
## [13] evaluate_0.23     jquerylib_0.1.4   fastmap_1.1.1     yaml_2.3.8       
## [17] lifecycle_1.0.4   memoise_2.0.1     compiler_4.4.0    fs_1.6.4         
## [21] systemfonts_1.0.6 digest_0.6.35     R6_2.5.1          magrittr_2.0.3   
## [25] bslib_0.7.0       tools_4.4.0       pkgdown_2.0.9     cachem_1.0.8     
## [29] desc_1.4.3