This function adds a new column, based on existing columns, to all identified groups in the given data set.

add_derived_column(av_state, name, columns, operation = c("SUM", "AVG", "LN",
  "MINUTES_TO_HOURS", "SQUARED"), log_level = 0)

Arguments

av_state

an object of class av_state

name

the name of the new column

columns

the existing columns that the new column will be based on

operation

this argument has five possible values:

  • 'SUM' - The new column is the sum of the columns specified in the columns argument. For this option, the columns argument is a vector of column names. NA values are treated as zero in the summation. Columns that are not numeric are converted to numeric. For example, factor columns are converted to numbers starting at 0 for the first factor level.

  • 'AVG' - The new column is the average of the columns specified in the columns argument. For each row, the new column holds the average of the values that are not NA on that row, or NA if every value on that row is NA.

  • 'LN' - The new column is the natural logarithm of the specified column in columns. Thus, for this option, the columns argument is simply the name of a single column. This operation does not work on columns that are not numeric. Values in the original column that are NA are left as NA in the new column. Note that values are increased if necessary so that the resulting column has no negative values.

  • 'MINUTES_TO_HOURS' - The new column is the values of the specified column divided by 60. Thus, for this option, the columns argument is simply the name of a single column. This operation does not work on columns that are not numeric. Values in the original column that are NA are left as NA in the new column.

  • 'SQUARED' - The new column is the square of the values of the specified column. Thus, for this option, the columns argument is simply the name of a single column. This operation does not work on columns that are not numeric. Values in the original column that are NA are left as NA in the new column.
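
The NA handling described for the operations above can be illustrated with a minimal base R sketch on a plain data frame. This is not the package's implementation: the data frame df and all of its column names are hypothetical, and the value shifting that the 'LN' option applies is not reproduced here.

```r
# Hypothetical toy data; the column names are made up for this illustration.
df <- data.frame(
  a = c(1, NA, 3, NA),
  b = c(4, 5, NA, NA),
  minutes = c(60, NA, 120, 90)
)

# 'SUM': NA values count as zero, so a row with only NA values sums to 0.
df$sum_ab <- rowSums(df[c("a", "b")], na.rm = TRUE)

# 'AVG': average of the non-NA values per row; NA when every value is NA.
df$avg_ab <- rowMeans(df[c("a", "b")], na.rm = TRUE)
df$avg_ab[is.nan(df$avg_ab)] <- NA

# 'MINUTES_TO_HOURS': divide by 60; NA stays NA.
df$hours <- df$minutes / 60

# 'SQUARED': square each value; NA stays NA.
df$a_sq <- df$a^2

# 'LN': natural logarithm; NA stays NA.
# (Assumption: no shifting of values is applied in this sketch.)
df$ln_hours <- log(df$hours)

# Factor conversion as described for 'SUM': first level maps to 0.
f <- factor(c("low", "high", "low"), levels = c("low", "high"))
level_codes <- as.integer(f) - 1L

print(df)
```

With add_derived_column itself, the same results would be requested through the operation argument rather than computed by hand.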

log_level

sets the minimum level of output that should be shown (a number between 0 and 3). A lower level means more verbosity. Specify a log_level of 3 to hide messages about converting columns or increasing values for the 'LN' option.

Value

This function returns the modified av_state object.

Examples

# NOT RUN {
av_state <- load_file("../data/input/RuwedataAngela.sav")
av_state <- add_derived_column(av_state,'SomPHQ',c('PHQ1','PHQ2','PHQ3','PHQ4',
                               'PHQ5','PHQ6','PHQ7','PHQ8','PHQ9'),
                               operation='SUM')
column_names_output(av_state)
av_state <- load_file("../data/input/pp1 nieuw compleet.sav",log_level=3)
av_state <- add_derived_column(av_state,'SomBewegUur','SomBewegen',
                               operation='MINUTES_TO_HOURS')
av_state <- add_derived_column(av_state,'lnSomBewegUur','SomBewegUur',
                               operation='LN')
av_state$data[[1]][c('SomBewegen','SomBewegUur','lnSomBewegUur')]
# }