Notes on SAS
Here are some basic notes on SAS. This is a compilation, and more will be added as time passes. Some notes will be spun off into a page or blog entry.
The Variable's Length Attribute
The type of data that a variable contains is only one of the attributes a variable can have. The next attribute is the variable's length. By default, SAS stores numeric variables in 8 bytes, and character variables read with simple list input also get a length of 8, unless you specify otherwise. When the data for a variable does not fit the defaults, you have to give SAS special instructions. To describe departures from SAS's default assumptions, use an informat or a format. An informat tells SAS how to interpret the data as it reads it. A format tells SAS how to display the data.
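As a minimal sketch of overriding the default length, the DATA step below uses a LENGTH statement. The data set name work.employees and the variables name and id are hypothetical, chosen only for illustration. Without the LENGTH statement, list input would truncate a 14-character name to 8 characters.

```sas
/* Hypothetical data set and variable names, for illustration only. */
data work.employees;
    /* Declare lengths before the variables are first used:
       name gets 20 bytes of character storage,
       id is stored in 4 bytes instead of the default 8. */
    length name $ 20 id 4;
    input name $ id;
    datalines;
Christopherson 1001
Lee 1002
;
run;

/* PROC CONTENTS lists each variable's type and length. */
proc contents data=work.employees;
run;
```

Shortening a numeric variable saves storage but limits the size of the integers it can hold exactly, so this is usually reserved for small whole numbers.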
Format vs. Informat
For example, you may want a number to be printed in Social Security number format: nnn-nn-nnnn. A format tells SAS the pattern to use when displaying the data value. Or, as in the gasoline data, you may have three separate data values that together make up a date. An informat tells SAS the pattern to use when reading in the three date values.
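The difference can be sketched with a small DATA step. The data set work.people, its variables, and the data lines are made up for illustration, and the date layout (mm/dd/yyyy) is an assumption, since the gasoline data is not shown here.

```sas
/* Hypothetical data, for illustration only. */
data work.people;
    /* mmddyy10. is an informat: it tells SAS how to read text
       such as 06/15/1991 and store it as a SAS date value. */
    input ssn hired : mmddyy10.;
    /* ssn11. and date9. are formats: they change only how the
       stored values are displayed, not the values themselves. */
    format ssn ssn11. hired date9.;
    datalines;
123456789 06/15/1991
987654321 01/02/1992
;
run;

proc print data=work.people;
run;
```

In the printed output, the first observation should appear as 123-45-6789 and 15JUN1991, even though both values are stored as plain numbers.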
The final SAS attribute that can be specified is a label that is printed instead of the variable name. SAS procedures print the variable's name on reports, charts or graphs. To give a more descriptive name, you may specify a label.
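A short sketch of the LABEL statement follows. The data set work.sales and its variables are hypothetical. Note that PROC PRINT shows labels only when its LABEL option is given, while most other procedures use labels automatically.

```sas
/* Hypothetical data, for illustration only. */
data work.sales;
    input region $ amt;
    /* Labels are stored with the data set and can replace
       the variable names in procedure output. */
    label region = 'Sales Region'
          amt    = 'Total Sales (US dollars)';
    datalines;
East 1200
West 950
;
run;

/* The LABEL option asks PROC PRINT to use the labels as column headings. */
proc print data=work.sales label;
run;
```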
Telling SAS where to get the data from (File-Handling Statements)
You tell SAS where to find the data with one of the file-handling statements below. These will be explained as you need to use them. The list shows each statement and where its data is found:
- SET: in a SAS data set
- INFILE: in an external file
- CARDS: in the program itself, on the lines that follow the statement
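The three statements can be sketched side by side. All data set names are hypothetical, and the file name scores.txt is a placeholder for an external file that is assumed to exist with one name and one score per line.

```sas
/* CARDS: the data lines are part of the program itself. */
data work.scores;
    input name $ score;
    cards;
Ann 90
Bob 72
;
run;

/* SET: read observations from an existing SAS data set. */
data work.scores_copy;
    set work.scores;
run;

/* INFILE: read raw data from an external file.
   'scores.txt' is a placeholder path, not a real file. */
data work.scores_ext;
    infile 'scores.txt';
    input name $ score;
run;
```

SET needs no INPUT statement because a SAS data set already carries its variable names and attributes, whereas INFILE and CARDS supply raw text that INPUT must describe.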
Program Statements
After reading an observation, SAS executes each program statement found in the DATA step. Program statements can include the following and many, many more:
- IF-THEN/ELSE: to test conditions
- DELETE/OUTPUT: to control writing to the data set being created
Once all the program statements are executed, the observation is written to the SAS data set named in the DATA statement, unless SAS is told otherwise with a specific program statement. Then the execution phase begins all over again. SAS continues to process observations until it looks at the input data and sees an end-of-file indicator.
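The execution cycle described above can be sketched with a DATA step that uses IF-THEN/ELSE, DELETE, and OUTPUT. The data set names work.passed and work.failed, the variables, and the cutoff of 60 are hypothetical.

```sas
/* Hypothetical data and cutoff, for illustration only. */
data work.passed work.failed;
    input name $ score;
    /* DELETE stops processing this observation without writing it
       and returns to the top of the step for the next one. */
    if score = . then delete;
    /* An explicit OUTPUT chooses the data set to write to and
       replaces the automatic write at the end of the step. */
    if score >= 60 then output work.passed;
    else output work.failed;
    datalines;
Ann 90
Bob 45
Cy .
;
run;
```

Here Ann is written to work.passed, Bob to work.failed, and Cy is dropped because the score is missing. The step ends when SAS runs out of data lines.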