Overview
A post came across the forums recently where someone needed their results transposed.
The hitch - there were multiple records for their key or identifying variables.
They wanted it down to one line for each.
The problem
Here was the dataset:
data have;
input ID_PRSN :$3. ID_CPS :$3. CD_ALLEGATION CD_FINDING;
datalines;
001 001 1 9
001 001 3 1
001 002 1 1
002 003 3 2
002 003 4 1
002 003 5 9
002 004 1 1
002 004 2 1
002 005 1 1
003 006 4 9
;
run;
A simple PROC TRANSPOSE, right?
proc transpose
data = have out = want;
by id_prsn id_cps;
var cd_allegation cd_finding;
run;

My Solution
You can use PROC TRANSPOSE twice to solve the issue above (plus a DATA step to make a counter variable), but why not get it done in two PROCS?
PROC SUMMARY is a little gem for this, and it’s kind of difficult to find.
It’s such a gem that it helped identify a coding issue in a SAS published book.
Step 1: Determine how many fields you need
First, I needed to create a counter flag - this flag would represent the number of records under a given id_prsn and id_cps.
Whatever the largest value was for counter was going to be the number of variables outputted.
proc sql noprint;
select
max(counter)
into: counter trimmed
from
( select
count(*) as counter
from
have
group by
id_prsn, id_cps
);
quit;
This may look a little weird, but let me point you in the right direction for interpretation.
- This query does not start at the
SELECTstatement - it starts at theFROMclause. - Look at the
FROMclause - it involves a subquery. This subquery is returning a column of the highest values byid_prsnandid_cps. - Now, go back to the opening
SELECTstatement. We are selecting the maximum value ofcounterfrom that subquery.- Additionally, we are selecting that maximum value and placing it into a macro variable called
counter.Note: The
trimmedoption is needed because this process returns a string with whitespace.
- Additionally, we are selecting that maximum value and placing it into a macro variable called
What is the value?
%put &counter.;

This means we have up to three variables for a given individual. We will need three variables each for CD_ALLEGATION and CD_FINDING to hold all values for all individuals.
Step 2: Apply your macro variable with PROC SUMMARY
Now for some fun, arcane syntax.
If you read the documentation, you will see that its similar to the MEANS procedure. In fact, I just tested whether this code ran in PROC MEANS and, go figure, it worked!
Nonetheless:
proc summary data = have nway; /* #1 */
class id_prsn id_cps; /* #2 */
output out = want (drop = _:) /* #3 */
idgroup(out[&counter.](cd_allegation cd_finding) = algtn find); /* #4 */
run;
NWAYis telling SAS to give me all combinations for eachid_prsnandid_cps. Don’t give me justid_prsnorid_cps- we want it all.- What are we classifying or grouping the variables by? Self-evident:
id_prsnandid_cps. - Whenever you want to output a dataset with
MEANSorSUMMARY, you specifyOUTPUT OUT = <your-new-dataset>.- Here, I’m creating a dataset named
wantand I am usingDROPto prevent automatic variables (_TYPE_,_FREQ_) from being attached to the dataset.
- Here, I’m creating a dataset named
- Now the fun:
IDGROUPallows you to make output variables with the same nameOUTis referring to our output dataset,want.[&counter.]is our macro variable from thePROC SQLstep.- This is telling
PROC SUMMARYthat our output dataset should have three variables.
- This is telling
- But what variables? Remember
CD_ALLEGATIONandCD_FINDING? Those are what we wanted to be transposed. So we follow the macro variable in brackets with the variables we want three of in our output dataset,CD_ALLEGATIONandCD_FINDING.Note the parentheses surrounding the variables.
- Finally, we separate our existing variables and our new variables with an
=sign.- On the left we have our existing variables, but on the right we have our new variables,
algtnandfind. They will be appended with the sequence they are processed (e.g.,algtn_1,algtn_2, etc.)
- On the left we have our existing variables, but on the right we have our new variables,