Overview

A post came across the forums recently where someone needed their results transposed.

The hitch - there were multiple records for their key or identifying variables.

They wanted it down to one line for each.

The problem

Here was the dataset:

data have;
input ID_PRSN :$3. ID_CPS :$3. CD_ALLEGATION CD_FINDING;
datalines;
001	001	1 9
001	001	3 1
001	002	1 1
002	003	3 2
002	003	4 1
002	003	5 9
002	004	1 1
002	004	2 1
002	005	1 1
003	006	4 9
;
run;

Alt Text

A simple PROC TRANSPOSE, right?

proc transpose
    data = have out = want;
        by id_prsn id_cps;
        var cd_allegation cd_finding;
run;

Alt Text

Alt Text

My Solution

You can use PROC TRANSPOSE twice to solve the issue above (plus a DATA step to make a counter variable), but why not get it done in two PROCS?

PROC SUMMARY is a little gem for this, and it’s kind of difficult to find.

It’s such a gem that it helped identify a coding issue in a SAS published book.

Step 1: Determine how many fields you need

First, I needed to create a counter flag - this flag would represent the number of records under a given id_prsn and id_cps.

Whatever the largest value was for counter was going to be the number of variables outputted.

proc sql noprint;
	select
				max(counter)
					into: counter trimmed
	from
				(	select
								count(*) as counter
					from
								have
					group by
								id_prsn, id_cps
				);
quit;

This may look a little weird, but let me point you in the right direction for interpretation.

  1. This query does not start at the SELECT statement - it starts at the FROM clause.
  2. Look at the FROM clause - it involves a subquery. This subquery is returning a column of the highest values by id_prsn and id_cps.
  3. Now, go back to the opening SELECT statement. We are selecting the maximum value of counter from that subquery.
    • Additionally, we are selecting that maximum value and placing it into a macro variable called counter.

      Note: The trimmed option is needed because this process returns a string with whitespace.

What is the value?

%put &counter.;

Alt Text

This means we have up to three variables for a given individual. We will need three variables each for CD_ALLEGATION and CD_FINDING to hold all values for all individuals.

Step 2: Apply your macro variable with PROC SUMMARY

Now for some fun, arcane syntax.

If you read the documentation, you will see that its similar to the MEANS procedure. In fact, I just tested whether this code ran in PROC MEANS and, go figure, it worked!

Nonetheless:

proc summary data = have nway; /* #1 */
	class id_prsn id_cps; /* #2 */
	output out = want (drop = _:) /* #3 */
		idgroup(out[&counter.](cd_allegation cd_finding) = algtn find); /* #4 */
run;
  1. NWAY is telling SAS to give me all combinations for each id_prsn and id_cps. Don’t give me just id_prsn or id_cps - we want it all.
  2. What are we classifying or grouping the variables by? Self-evident: id_prsn and id_cps.
  3. Whenever you want to output a dataset with MEANS or SUMMARY, you specify OUTPUT OUT = <your-new-dataset>.
    • Here, I’m creating a dataset named want and I am using DROP to prevent automatic variables (_TYPE_, _FREQ_) from being attached to the dataset.
  4. Now the fun:
    • IDGROUP allows you to make output variables with the same name
    • OUT is referring to our output dataset, want.
    • [&counter.] is our macro variable from the PROC SQL step.
      • This is telling PROC SUMMARY that our output dataset should have three variables.
    • But what variables? Remember CD_ALLEGATION and CD_FINDING? Those are what we wanted to be transposed. So we follow the macro variable in brackets with the variables we want three of in our output dataset, CD_ALLEGATION and CD_FINDING.

      Note the parentheses surrounding the variables.

    • Finally, we separate our existing variables and our new variables with an = sign.
      • On the left we have our existing variables, but on the right we have our new variables, algtn and find. They will be appended with the sequence they are processed (e.g., algtn_1,algtn_2, etc.)

Alt Text