


DM(1STAT)           UNIX Programmer's Manual            DM(1STAT)



NNNNAAAAMMMMEEEE
     dm - data manipulator: formatting and conditional transfor-
     mations

SSSSYYYYNNNNOOOOPPPPSSSSIIIISSSS
     ddddmmmm [Efile] [expressions]

DDDDEEEESSSSCCCCRRRRIIIIPPPPTTTTIIIIOOOONNNN
     _d_m is a data manipulating program that allows you to extract
     columns from a file, possibly based on conditions, and pro-
     duce algebraic combinations of columns.  _d_m reads from the
     standard input (via redirection with < or piped with |) and
     writes to the standard output.  To use _d_m, you write a
     series of expressions, and for each line of the input, _d_m
     reevaluates and prints the values of those expressions.  _d_m
     provides you with much of the power of an interpreted gen-
     eral purpose programming language, but is much more con-
     venient.  _d_m does many transformations of data one would
     usually need to write and compile a one-shot program for.

     _d_m allows you to access the fields on each line of its
     input.  Numerical values of fields on a line can be accessed
     by the letter 'x' followed by a column number.  Character
     strings can be accessed by the letter 's' followed by a
     column number.  For example, for the input line:
                             12 45.2 red
     s1 is the string '12', x2 is the number 45.2 (which is not
     the same as s2, the string '45.2'), and s3 is the string
     'red'.

     _C_o_l_u_m_n _E_x_t_r_a_c_t_i_o_n. Simple column extraction can be accom-
     plished by typing the strings from the columns desired.  For
     example, to extract the third, eighth, first and second
     columns (in that order) from "file," one would type:
                        dm s3 s8 s1 s2 < file

     _A_l_g_e_b_r_a_i_c _T_r_a_n_s_f_o_r_m_a_t_i_o_n_s. Algebraic operations involving
     the numerical values of fields can be accomplished in a
     straightforward manner.  To print, in order, the sum of the
     first two columns, the difference of the next two columns,
     and the square root of the sum of squares of the first and
     third columns, one could type the command:
                dm "x1+x2" "x3-x4" "(x1*x1+x3*x3)^.5"
     There are a number of the usual mathematical functions that
     allow expressions like:
             dm "exp(x1) + log(log(x2))" "floor (x1/x2)"

     _T_e_s_t_i_n_g _C_o_n_d_i_t_i_o_n_s. Expressions can be conditionally
     evaluated by testing the values of other expressions.  For
     example, to print the ratio of x1 and x2, one might want to
     check the value of x2 before division and print 'error' if
     x2 is 0.0.  This could be done by the command:



Printed 11/22/82        October 16, 1980                        1






DM(1STAT)           UNIX Programmer's Manual            DM(1STAT)



                dm "if x2 = 0 then 'error' else x1/x2"
     Or one might want to extract only those lines in which two
     columns have the same lexical value, such as searching for
     matching responses.  If the obtained response is in column
     five and the correct response is in column two, this could
     be accomplished with:
                 dm "if s5 = s2 then INPUT else KILL"
     INPUT is a string variable that is equal to the current
     input line and KILL is a control primitive that terminates
     execution for the current line.

     _O_t_h_e_r _F_e_a_t_u_r_e_s. _d_m offers a full set of comparison, alge-
     braic, and logical operators.  It also features a set of
     special variables that hold useful information for you and
     allow taking control in exceptional conditions.  These
     include: INPUT, the current input line; N, the number of
     fields in INPUT; SUM, the sum of the columns in the INPUT;
     RAND, a uniform random number; NIL, an expression that
     causes no output; KILL, which terminates evaluation on INPUT
     and goes to the next line; and EXIT, which terminates all
     processing.

SSSSEEEEEEEE AAAALLLLSSSSOOOO
     abut(1CSL),
     G. Perlman, _d_m - _A _d_a_t_a _m_a_n_i_p_u_l_a_t_o_r, Cognitive Science
     Laboratory.

     G. Perlman, Data analysis programs for the UNIX operating
     system, _B_e_h_a_v_i_o_r _R_e_s_e_a_r_c_h _M_e_t_h_o_d_s _a_n_d _I_n_s_t_r_u_m_e_n_t_a_t_i_o_n, 1980,
     _1_2, 554-558.

AAAAUUUUTTTTHHHHOOOORRRR
     Gary Perlman

BBBBUUUUGGGGSSSS

















Printed 11/22/82        October 16, 1980                        2






