Data stage (in first CL script)

start data,

serial number in 1-5,

card characters 500,

!

ds $sex=$8/1,2,

 xt='Sex of respondent', 

 x='Sex\Male;Female', 

!

ds $q1=$9/1-3,e,

 xt='Rating of test product', 

 x='Rating\Good<v1>;Average<v0>;Poor<v-1>;No answer', 

!

dm $q2=$10/1,$11/1,$12/1,

 xt='Products eaten every day', 

 x='Every Day\Bread;Meat;Potatoes', 

!

di $q3=$13-14,

 xt='Cups of tea drunk daily', 

!

ds $sq3=$q3/0,1..5,>5,e,

 xt='Cups of tea drunk daily', 

 x='Tea\None;Up to 5;Over 5;Faulty response', 

!

dc $q4=$15-18,

 xt='Title', 

!

ds $sq4=$q4/'Mr ','Mrs ','Miss','Ms ',e,

 xt='Title', 

 x='Title\Mr.;Mrs.;Miss;Ms.;Other', 

!

finish data,

This stage defines seven variables in all. The first ($SEX) is used as the breakdown on the tables. Four other variables ($Q1, $Q2, $Q3 and $Q4) are picked up to be used as the rows on the tables. The last two of these are not suitable to be used directly from the data, so they are "recoded" into new variables ($SQ3 and $SQ4) for use in the tables stage. The texts are indented in the above CL script for ease of reading; remember that CL ignores all spaces unless they are within quotes.

$SEX

This creates a single-valued variable with two responses. If data location 8 contains a 1 then the first response is set true; if it contains a 2 then the second response is set true. Data location 8 ($8) is the eighth character on each data line. For this example it contains a 1 if the respondent was male and a 2 if the respondent was female. This variable is given a title (xt=) which is used at the top of the tables. It is also given some response text (x=) which is used above the columns on the tables. Note the column header followed by a \ ("Sex"), which will be centred above the two columns.
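To make the idea of a data location concrete, here is a minimal Python sketch (not CL) of how a single-valued variable like $SEX could be derived from one data line. The function name `single_response` and the sample record are hypothetical; only the rule "location 8 holds a 1 or a 2" comes from the script above.

```python
def single_response(record, location, codes):
    """Return one True/False flag per code for a single data location.

    `location` is 1-based, matching the $8 notation in the CL script.
    A record that is too short is treated as blank at that location
    (an assumption of this sketch).
    """
    char = record[location - 1] if len(record) >= location else " "
    return [char == code for code in codes]


# Hypothetical data line: serial number in 1-5, sex in location 8.
record = "00001  2"
sex = single_response(record, 8, ["1", "2"])
print(sex)  # first flag is "Male", second is "Female"
```

Because each flag tests the same character against a different code, at most one response can be true for any respondent.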

$Q1

This defines a single-valued variable with three responses from data location 9 plus a "reject" response (e). The first three rows are given score values (1, 0 and -1) using <V> label controls, which will be used to generate a mean score whenever this variable is used as the rows of a table.
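As a rough illustration of what the score values do, the following Python sketch computes a mean score from response counts. The counts are invented for the example, and the choice to leave unscored responses (here "No answer") out of the mean is an assumption of this sketch, not something stated in this section.

```python
def mean_score(counts, scores):
    """Weighted mean of scores over the responses that carry a score.

    Responses whose score is None are skipped (an assumption of this sketch).
    Returns None when no scored response has any respondents.
    """
    scored = [(n, s) for n, s in zip(counts, scores) if s is not None]
    base = sum(n for n, _ in scored)
    if base == 0:
        return None
    return sum(n * s for n, s in scored) / base


# Scores from the <v...> controls: Good=1, Average=0, Poor=-1, No answer unscored.
scores = [1, 0, -1, None]
counts = [30, 50, 20, 10]  # hypothetical respondent counts per row
print(mean_score(counts, scores))
```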

$Q2

This defines a multi-valued variable with three responses. Each response is separate and a respondent may have any number of responses true. This information has been coded as a 1 in three consecutive data locations. If a respondent has eaten all three foods then all three locations will contain a 1.
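The difference from a single-valued variable can be sketched in Python as follows: each response has its own data location and its own test, so any number of flags may be true at once. The helper name `multi_response` and the sample record are hypothetical.

```python
def multi_response(record, tests):
    """Return one flag per (location, code) pair; any number may be True.

    Locations are 1-based, matching the $10/1,$11/1,$12/1 notation.
    """
    return [
        len(record) >= location and record[location - 1] == code
        for location, code in tests
    ]


# $q2 tests locations 10, 11 and 12 for a "1" (Bread, Meat, Potatoes).
q2_tests = [(10, "1"), (11, "1"), (12, "1")]

# Hypothetical data line: bread and potatoes eaten every day, meat not.
record = "00001  121 1"
q2 = multi_response(record, q2_tests)
print(q2)
```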

$Q3

This defines an integer variable of 2 digits. The two data locations are called a field and will contain a number between 0 and 99. This variable can be used as the rows of a "value distribution table" but it is normal to band the values for tabulation.

$SQ3

This variable is not defined from the raw data but uses a previously defined variable ($Q3) to create four responses. The last response will be true if a faulty response (for example -1) is found in $Q3.

This variable could be defined directly from the raw data as:

ds $sq3=$13-14/0,1..5,>5,e,

but it is better practice to pick up raw data into a variable first and then recode it.
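The two-step approach (pick up the field, then band it) can be sketched in Python as below. The function name `band_cups` is hypothetical, and treating a blank or non-numeric field as a faulty response is an assumption of this sketch; how CL itself handles such fields is not described in this section.

```python
LABELS = ["None", "Up to 5", "Over 5", "Faulty response"]


def band_cups(field):
    """Band the 2-character field from locations 13-14 into one of four labels.

    Mirrors the tests 0, 1..5, >5, e in the $sq3 definition.
    """
    try:
        value = int(field)  # step 1: pick up the raw field as an integer
    except ValueError:
        return LABELS[3]  # assumption: unreadable fields count as faulty
    # step 2: recode the integer into bands
    if value == 0:
        return LABELS[0]
    if 1 <= value <= 5:
        return LABELS[1]
    if value > 5:
        return LABELS[2]
    return LABELS[3]  # anything else, for example -1


for field in [" 0", "03", "12", "-1"]:
    print(repr(field), band_cups(field))
```

Keeping the integer pick-up separate from the banding means the bands can be changed later without touching the definition that reads the raw data.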

$Q4

This defines a character variable of four characters, taken from data locations 15 to 18. A character variable is needed when the data is stored as letters instead of numbers. It is not tabulated directly; instead, its expected contents are recoded into a separate variable ($SQ4) for tabulation.

$SQ4

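This variable has five responses which are picked up from variable $Q4: one for each of the four expected titles, and a final reject response (e), labelled "Other", for any other contents.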
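A Python sketch of the same recode is shown below. The function name `recode_title` is hypothetical, and stripping trailing spaces before comparing is an assumption made here to cope with padded fields; CL's own rules for comparing character fields are not covered in this section.

```python
TITLES = ["Mr", "Mrs", "Miss", "Ms"]
LABELS = ["Mr.", "Mrs.", "Miss", "Ms.", "Other"]


def recode_title(field):
    """Map the 4-character title field to one of five response labels.

    Trailing padding is ignored (an assumption of this sketch); anything
    that is not one of the four expected titles falls into "Other".
    """
    key = field.rstrip()
    if key in TITLES:
        return LABELS[TITLES.index(key)]
    return LABELS[-1]


for field in ["Mr  ", "Mrs ", "Miss", "Ms  ", "Dr  "]:
    print(repr(field), recode_title(field))
```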