Let’s say you have data in long format:

Alt Text

The following dataset is in the sashelp directory, and its name is iris. It is a common example dataset across languages. No matter which edition of SAS you are using, you should have access to this dataset.

You want to consolidate all SepalLength values for a given species into a single field.

This transformation is similar to long-to-wide transformations, except we are combining values into a single field instead of creating separate variables for each value.

If you want to create multiple variables from multiple values, I would recommend PROC TRANSPOSE or a DATA step with an ARRAY statement.

One method is using a DO-UNTIL loop.

data want;
	do until (last.species); /* #1 */
		set sashelp.iris; /* #2 */
		by species; /* #3 */
		length all_lengths $300.; /* #4 */
		retain all_lengths; /* #5 */
			if first.species then all_lengths = sepallength; /* #6 */
				else all_lengths = catx(', ', all_lengths, sepallength); /* #7 */
			if last.species then output; /* #8 */
	end; /* #9 */
run;

This dataset is already sorted by species. If your data are not sorted, you will need to sort it via PROC SORT or a PROC SQL query.

  1. The DO UNTIL statement is going to process until a condition is satisfied.
    • The condition that we’re using here is last.species. We’re telling SAS to continue to process until last.species is equal to 1.
  2. We’re reading the sashelp.iris data into the PDV.
  3. We initiate BY-GROUP processing. This is required since we’re using the DO UNTIL statement.
    • This tells SAS to create automatic variables first.species and last.species and set each respective record (first and last) to 1. If it isn’t the first or last record, SAS marks the respective field as 0.
  4. Tell SAS how long you want the new variable, all_lengths to be. You can always modify this later, so I set it to a long length to ensure we capture everything.
  5. RETAIN will hold the value of all_lengths across observations for a given species.
  6. If first.species = 1 (implied with first.species), set all_lengths to sepallength.
  7. Otherwise, if first.species = 0, all_lengths should combine what already exists in all_lengths with the current record’s value of sepallength.
    • We’re combining with the CATX function, which adds a delimiter (in this case, , ) between each value.
  8. If last.species = 1 (again, implied), output the record to the want dataset.
  9. The END for the DO UNTIL statement returns to the top and reads the next group of species into the PDV.

This results in the following:

Alt Text