Data <new name>;
Retain <variables>; can be used to set the order of variables in the table (must come before the set statement)
Set <libref.file>; the libref is not required if the file is in the Work library
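Length <variable> <$> w; where w = number of bytes stored: for a character variable ($) this is the maximum number of characters, for a numeric variable it is the storage size in bytes (usually 3 to 8, default 8). Decimal places are not set here; they are set with a w.d format in the Format statement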
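<new variable> = <expression>; assign new variables by typing the new variable name and then defining it
Label <variable> = '<label text>';
Format <variable> <format>; e.g., w.d for a numeric variable, where w = total print width and d = decimal places
Where <condition>; create a data subset by limiting the table to the rows that meet the condition, e.g., a particular level of a variable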
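Drop <variables>; dropped variables are not saved to the new dataset but remain available for processing in the data step. If the drop= data set option is used on the set statement instead, e.g., set <libref.file> (drop=<variables>), the variables are not read into the PDV and are thus not available for processing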
Keep <variables>; limits the variables that are saved to the new dataset
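If <condition>; subsetting if: only observations that meet the condition continue through the step and are saved to the new dataset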
run;
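The statements above can be combined as in the following sketch. It assumes the sample table sashelp.class (shipped with SAS; columns Name, Sex, Age, Height in inches, Weight in pounds). The output name work.class_teen and the variables bmi and size are made up for illustration.

```sas
/* Sketch only: dataset and new variable names are illustrative */
data work.class_teen;
    retain Name Age Sex;                 /* column order; must precede SET */
    set sashelp.class;                   /* libref needed: table is not in Work */
    where Age >= 13;                     /* rows are filtered as they are read */
    length size $ 5;                     /* new character variable, 5 bytes */
    bmi = 703 * Weight / Height**2;      /* new variable from an expression */
    if bmi >= 18 then size = 'large';
    else size = 'small';
    label bmi = 'Body mass index';
    format bmi 5.1;                      /* print width 5, 1 decimal place */
    drop Height Weight;                  /* used above, but not saved */
run;
```

Height and Weight can still be used to compute bmi because the Drop statement only affects what is written to the new dataset.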
Multiple data sets can be created in one data step (name each of them in the data statement); routing observations to them is best done with a select group:
Select (<variable>);
When ('level1') output <data1>;
When ('level2') output <data2>;
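Otherwise; catches observations that match no When level. Use Otherwise output <data3>; to send them to another data set. Without an Otherwise statement, SAS stops with an error if no When condition is met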
end;
To avoid errors due to capitalisation, use the upcase or lowcase function, e.g., select (lowcase(<variable>))
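A minimal sketch of the select group in a full data step, again assuming the sample table sashelp.class, where Sex is coded 'F' or 'M'. The three output names are made up for illustration.

```sas
/* Every output data set must be named in the DATA statement */
data work.girls work.boys work.other;
    set sashelp.class;
    select (lowcase(Sex));               /* compare in lower case only */
        when ('f') output work.girls;
        when ('m') output work.boys;
        otherwise output work.other;     /* any unexpected coding ends up here */
    end;
run;
```

Sending unmatched observations to their own data set makes unexpected codings easy to spot afterwards.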
To collapse levels of a variable, e.g., to collapse value 3 into 2:
<variable2> = <variable1>;
if <variable1> = 3 then <variable2> = 2;
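As an illustration of the same pattern, the sketch below folds age 16 into age 15 in the sample table sashelp.class. The names work.class_grouped and age_group are made up for this example.

```sas
data work.class_grouped;
    set sashelp.class;
    age_group = Age;                     /* the copy keeps the original intact */
    if Age = 16 then age_group = 15;     /* fold level 16 into level 15 */
run;

/* Cross-check old against new levels */
proc freq data=work.class_grouped;
    tables Age*age_group / list;
run;
```

Working on a copy rather than overwriting the original variable means the recode can be checked, as the proc freq step does here.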
Converting character to numerical variables:
numvar = INPUT(charvar, best32.);
or
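<numvar> = <charvar> + 0; implicit conversion: this works but writes a note to the log, and the result must be stored in a new variable because the type of an existing variable cannot be changed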
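A small sketch of the INPUT approach on made-up data. The data set name work.convert_demo and the three values are invented for illustration. The ?? modifier is an optional addition, not part of the notes above: it suppresses the log messages for values that cannot be converted, which then become missing.

```sas
data work.convert_demo;
    length charvar $ 8;
    input charvar $;                     /* read each value as character */
    numvar = input(charvar, ?? best32.); /* 'abc' becomes missing (.) */
    datalines;
12
3.5
abc
;
run;
```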