We often need to flag whether a patient was diagnosed with a given medical condition.
There are many ways to do this, and I’m only showing a couple ways that are generally effective.
First, here is some example data that largely mimic the format of claims data.
data work.patients;
infile datalines dsd truncover;
input enrolid:3. dx1:$7. dx2:$7. dx3:$7. dx4:$7.;
datalines;
2,A821,E133519,F1194,E133299
4,E75241,E731,E088,E083291
4,E75241,A5139,E088,E083291
5,E0810,A5139,E133291,A710
5,E0810,A360,E133291,A710
6,A3790,A4289,A5041,F19121
7,F810,A4289,A5041,A300
7,F810,E083291,A5041,A300
7,F810,E083291,A5041,E103591
8,E531,A4150,A5041,E103591
8,E531,A4150,A5433,F0630
9,A5275,F04,A5433,F0630
10,E853,A231,E781,A5139
11,F554,A5423,E13641,A5139
12,E848,A5423,E13641,E7881
12,E848,E75240,E13641,E7881
13,A663,E000,F19980,A5423
15,F11129,A3790,A5433,E75240
15,F11129,A3790,A5433,E731
16,A3700,A3790,A5433,E731
16,A3700,E133519,E13641,E0939
17,A5139,E133519,E13641,E0939
18,E030,F524,A231,A1859
20,A5002,E7500,A3952,F18951
20,A5002,F13280,A839,F18951
21,F524,F13280,A839,E089
22,F11129,E1137X9,A5041,E103219
25,A4289,E71313,F1293,A218
25,A4289,E71313,F04,A218
27,A5139,E030,A5209,E103219
27,A5139,F0781,A5209,E103219
30,E28319,F13232,A5209,F812
31,E08311,F13232,A5209,F1821
32,F13280,A5139,E731,A5209
32,F13280,A5402,A5275,A218
33,E103591,A5402,A5275,A218
33,E103591,F554,A5275,A218
33,E103591,F554,E093499,F18951
34,E13641,F521,E093499,F18951
35,E1137X9,F810,E133519,E093541
36,A5002,F554,E133519,F11288
38,A523,F15980,F13181,A5139
44,E853,E731,A5423,F1194
45,E0939,F1194,E0939,E133299
45,E0939,F521,E0939,E133299
46,A300,F521,A4150,F19121
46,A300,A199,E103591,E211
47,F13181,A199,E103591,E211
47,F13181,E58,E030,F19980
49,A3700,E853,F19980,F0781
;
If you saw my last post, I discussed how you can convert large datasets into formats to make useful labels. That post is here.
Here’s what the above dataset is showing us:
enrolid | dx1 | dx2 | dx3 | dx4 |
---|---|---|---|---|
2 | Urban rabies | Other specified diabetes mellitus with proliferative diabetic retinopathy with macular edema | Opioid use | Other specified diabetes mellitus with mild nonproliferative diabetic retinopathy without macular edema |
4 | Niemann-Pick disease type B | Secondary lactase deficiency | Diabetes mellitus due to underlying condition with unspecified complications | Diabetes mellitus due to underlying condition with mild nonproliferative diabetic retinopathy without macular edema |
4 | Niemann-Pick disease type B | Other secondary syphilis of skin | Diabetes mellitus due to underlying condition with unspecified complications | Diabetes mellitus due to underlying condition with mild nonproliferative diabetic retinopathy without macular edema |
5 | Diabetes mellitus due to underlying condition with ketoacidosis without coma | Other secondary syphilis of skin | Other specified diabetes mellitus with mild nonproliferative diabetic retinopathy without macular edema | Initial stage of trachoma |
5 | Diabetes mellitus due to underlying condition with ketoacidosis without coma | Pharyngeal diphtheria | Other specified diabetes mellitus with mild nonproliferative diabetic retinopathy without macular edema | Initial stage of trachoma |
6 | Whooping cough | Other forms of actinomycosis | Late congenital syphilitic meningitis | Other psychoactive substance abuse with intoxication delirium |
7 | Specific reading disorder | Other forms of actinomycosis | Late congenital syphilitic meningitis | Indeterminate leprosy |
7 | Specific reading disorder | Diabetes mellitus due to underlying condition with mild nonproliferative diabetic retinopathy without macular edema | Late congenital syphilitic meningitis | Indeterminate leprosy |
7 | Specific reading disorder | Diabetes mellitus due to underlying condition with mild nonproliferative diabetic retinopathy without macular edema | Late congenital syphilitic meningitis | Type 1 diabetes mellitus with proliferative diabetic retinopathy without macular edema |
8 | Pyridoxine deficiency | Gram-negative sepsis | Late congenital syphilitic meningitis | Type 1 diabetes mellitus with proliferative diabetic retinopathy without macular edema |
8 | Pyridoxine deficiency | Gram-negative sepsis | Gonococcal keratitis | Mood disorder due to known physiological condition |
9 | Syphilis of kidney and ureter | Amnestic disorder due to known physiological condition | Gonococcal keratitis | Mood disorder due to known physiological condition |
10 | Secondary systemic amyloidosis | Brucellosis due to Brucella abortus | Pure hyperglyceridemia | Other secondary syphilis of skin |
11 | Abuse of vitamins | Gonococcal infection of other male genital organs | Other specified diabetes mellitus with hypoglycemia with coma | Other secondary syphilis of skin |
12 | Cystic fibrosis with other manifestations | Gonococcal infection of other male genital organs | Other specified diabetes mellitus with hypoglycemia with coma | Lipoid dermatoarthritis |
12 | Cystic fibrosis with other manifestations | Niemann-Pick disease type A | Other specified diabetes mellitus with hypoglycemia with coma | Lipoid dermatoarthritis |
13 | Hyperkeratosis of yaws | Congenital iodine-deficiency syndrome | Other psychoactive substance use | Gonococcal infection of other male genital organs |
15 | Opioid abuse with intoxication | Whooping cough | Gonococcal keratitis | Niemann-Pick disease type A |
15 | Opioid abuse with intoxication | Whooping cough | Gonococcal keratitis | Secondary lactase deficiency |
16 | Whooping cough due to Bordetella pertussis without pneumonia | Whooping cough | Gonococcal keratitis | Secondary lactase deficiency |
16 | Whooping cough due to Bordetella pertussis without pneumonia | Other specified diabetes mellitus with proliferative diabetic retinopathy with macular edema | Other specified diabetes mellitus with hypoglycemia with coma | Drug or chemical induced diabetes mellitus with other diabetic ophthalmic complication |
17 | Other secondary syphilis of skin | Other specified diabetes mellitus with proliferative diabetic retinopathy with macular edema | Other specified diabetes mellitus with hypoglycemia with coma | Drug or chemical induced diabetes mellitus with other diabetic ophthalmic complication |
18 | Congenital hypothyroidism with diffuse goiter | Premature ejaculation | Brucellosis due to Brucella abortus | Other tuberculosis of eye |
20 | Early congenital syphilitic osteochondropathy | GM2 gangliosidosis | Meningococcal myocarditis | Inhalant use |
20 | Early congenital syphilitic osteochondropathy | Sedative | Mosquito-borne viral encephalitis | Inhalant use |
21 | Premature ejaculation | Sedative | Mosquito-borne viral encephalitis | Diabetes mellitus due to underlying condition without complications |
22 | Opioid abuse with intoxication | Type 2 diabetes mellitus with diabetic macular edema | Late congenital syphilitic meningitis | Type 1 diabetes mellitus with mild nonproliferative diabetic retinopathy with macular edema |
25 | Other forms of actinomycosis | Glutaric aciduria type II | Cannabis use | Other forms of tularemia |
25 | Other forms of actinomycosis | Glutaric aciduria type II | Amnestic disorder due to known physiological condition | Other forms of tularemia |
27 | Other secondary syphilis of skin | Congenital hypothyroidism with diffuse goiter | Other cardiovascular syphilis | Type 1 diabetes mellitus with mild nonproliferative diabetic retinopathy with macular edema |
27 | Other secondary syphilis of skin | Postconcussional syndrome | Other cardiovascular syphilis | Type 1 diabetes mellitus with mild nonproliferative diabetic retinopathy with macular edema |
30 | Asymptomatic premature menopause | Sedative | Other cardiovascular syphilis | Mathematics disorder |
31 | Diabetes mellitus due to underlying condition with unspecified diabetic retinopathy with macular edema | Sedative | Other cardiovascular syphilis | Inhalant dependence |
32 | Sedative | Other secondary syphilis of skin | Secondary lactase deficiency | Other cardiovascular syphilis |
32 | Sedative | Gonococcal vulvovaginitis | Syphilis of kidney and ureter | Other forms of tularemia |
33 | Type 1 diabetes mellitus with proliferative diabetic retinopathy without macular edema | Gonococcal vulvovaginitis | Syphilis of kidney and ureter | Other forms of tularemia |
33 | Type 1 diabetes mellitus with proliferative diabetic retinopathy without macular edema | Abuse of vitamins | Syphilis of kidney and ureter | Other forms of tularemia |
33 | Type 1 diabetes mellitus with proliferative diabetic retinopathy without macular edema | Abuse of vitamins | Drug or chemical induced diabetes mellitus with severe nonproliferative diabetic retinopathy without macular edema | Inhalant use |
34 | Other specified diabetes mellitus with hypoglycemia with coma | Sexual aversion disorder | Drug or chemical induced diabetes mellitus with severe nonproliferative diabetic retinopathy without macular edema | Inhalant use |
35 | Type 2 diabetes mellitus with diabetic macular edema | Specific reading disorder | Other specified diabetes mellitus with proliferative diabetic retinopathy with macular edema | Drug or chemical induced diabetes mellitus with proliferative diabetic retinopathy with combined traction retinal detachment and rhegmatogenous retinal detachment |
36 | Early congenital syphilitic osteochondropathy | Abuse of vitamins | Other specified diabetes mellitus with proliferative diabetic retinopathy with macular edema | Opioid dependence with other opioid-induced disorder |
38 | Neurosyphilis | Other stimulant use | Sedative | Other secondary syphilis of skin |
44 | Secondary systemic amyloidosis | Secondary lactase deficiency | Gonococcal infection of other male genital organs | Opioid use |
45 | Drug or chemical induced diabetes mellitus with other diabetic ophthalmic complication | Opioid use | Drug or chemical induced diabetes mellitus with other diabetic ophthalmic complication | Other specified diabetes mellitus with mild nonproliferative diabetic retinopathy without macular edema |
45 | Drug or chemical induced diabetes mellitus with other diabetic ophthalmic complication | Sexual aversion disorder | Drug or chemical induced diabetes mellitus with other diabetic ophthalmic complication | Other specified diabetes mellitus with mild nonproliferative diabetic retinopathy without macular edema |
46 | Indeterminate leprosy | Sexual aversion disorder | Gram-negative sepsis | Other psychoactive substance abuse with intoxication delirium |
46 | Indeterminate leprosy | Miliary tuberculosis | Type 1 diabetes mellitus with proliferative diabetic retinopathy without macular edema | Secondary hyperparathyroidism |
47 | Sedative | Miliary tuberculosis | Type 1 diabetes mellitus with proliferative diabetic retinopathy without macular edema | Secondary hyperparathyroidism |
47 | Sedative | Dietary calcium deficiency | Congenital hypothyroidism with diffuse goiter | Other psychoactive substance use |
49 | Whooping cough due to Bordetella pertussis without pneumonia | Secondary systemic amyloidosis | Other psychoactive substance use | Postconcussional syndrome |
It’s obviously a lot of information, and that’s the last time I will show it with labels.
Also, these diagnoses were randomly selected, so pardon any random/weird combinations.
Flagging diagnoses: the naive method
Let’s say that you need to flag any individual who has a diagnosis code related to opioid use. We could use the following code:
data patients_data_naive;
set work.patients;
length opioid_flag 3.;
if dx1 = "F1194" or dx2 = "F1194" or dx3 = "F1194" or dx4 = "F1194" then opioid_flag = 1;
else opioid_flag = 0;
run;
This is obviously DATA
step code, and it uses IF-THEN-ELSE
processing to flag whether someone has an opioid use diagnosis code (F1194
).
We could also do it using PROC SQL
and the CASE WHEN
statement:
proc sql;
create table work.patients_sql_naive as
select
*,
case
when dx1 = "F1194" or dx2 = "F1194" or dx3 = "F1194" or dx4 = "F1194" then 1
else 0
end as opioid_flag length = 3
from
work.patients;
quit;
Both of these methods produce the following output:
enrolid | dx1 | dx2 | dx3 | dx4 | opioid_flag |
---|---|---|---|---|---|
2 | A821 | E133519 | F1194 | E133299 | 1 |
4 | E75241 | E731 | E088 | E083291 | 0 |
4 | E75241 | A5139 | E088 | E083291 | 0 |
5 | E0810 | A5139 | E133291 | A710 | 0 |
5 | E0810 | A360 | E133291 | A710 | 0 |
6 | A3790 | A4289 | A5041 | F19121 | 0 |
7 | F810 | A4289 | A5041 | A300 | 0 |
7 | F810 | E083291 | A5041 | A300 | 0 |
7 | F810 | E083291 | A5041 | E103591 | 0 |
8 | E531 | A4150 | A5041 | E103591 | 0 |
8 | E531 | A4150 | A5433 | F0630 | 0 |
9 | A5275 | F04 | A5433 | F0630 | 0 |
10 | E853 | A231 | E781 | A5139 | 0 |
11 | F554 | A5423 | E13641 | A5139 | 0 |
12 | E848 | A5423 | E13641 | E7881 | 0 |
12 | E848 | E75240 | E13641 | E7881 | 0 |
13 | A663 | E000 | F19980 | A5423 | 0 |
15 | F11129 | A3790 | A5433 | E75240 | 0 |
15 | F11129 | A3790 | A5433 | E731 | 0 |
16 | A3700 | A3790 | A5433 | E731 | 0 |
16 | A3700 | E133519 | E13641 | E0939 | 0 |
17 | A5139 | E133519 | E13641 | E0939 | 0 |
18 | E030 | F524 | A231 | A1859 | 0 |
20 | A5002 | E7500 | A3952 | F18951 | 0 |
20 | A5002 | F13280 | A839 | F18951 | 0 |
21 | F524 | F13280 | A839 | E089 | 0 |
22 | F11129 | E1137X9 | A5041 | E103219 | 0 |
25 | A4289 | E71313 | F1293 | A218 | 0 |
25 | A4289 | E71313 | F04 | A218 | 0 |
27 | A5139 | E030 | A5209 | E103219 | 0 |
27 | A5139 | F0781 | A5209 | E103219 | 0 |
30 | E28319 | F13232 | A5209 | F812 | 0 |
31 | E08311 | F13232 | A5209 | F1821 | 0 |
32 | F13280 | A5139 | E731 | A5209 | 0 |
32 | F13280 | A5402 | A5275 | A218 | 0 |
33 | E103591 | A5402 | A5275 | A218 | 0 |
33 | E103591 | F554 | A5275 | A218 | 0 |
33 | E103591 | F554 | E093499 | F18951 | 0 |
34 | E13641 | F521 | E093499 | F18951 | 0 |
35 | E1137X9 | F810 | E133519 | E093541 | 0 |
36 | A5002 | F554 | E133519 | F11288 | 0 |
38 | A523 | F15980 | F13181 | A5139 | 0 |
44 | E853 | E731 | A5423 | F1194 | 1 |
45 | E0939 | F1194 | E0939 | E133299 | 1 |
45 | E0939 | F521 | E0939 | E133299 | 0 |
46 | A300 | F521 | A4150 | F19121 | 0 |
46 | A300 | A199 | E103591 | E211 | 0 |
47 | F13181 | A199 | E103591 | E211 | 0 |
47 | F13181 | E58 | E030 | F19980 | 0 |
49 | A3700 | E853 | F19980 | F0781 | 0 |
Arrays in SAS
One issue with the naive method is that it requires you to explicitly type out each variable. This isn’t helpful if you have up to fifty diagnosis/procedure codes in your data sets.
This is where arrays are extremely helpful. Arrays allow you to reference all of these variables with a single keyword. Below is an example of an array statement:
array _icd [4] dx1 dx2 dx3 dx4
Now, you may be saying that this isn’t helpful – you’re still typing out all the variables! Let’s break it down.
- We have the
ARRAY
statment itself. - The name of your array,
_icd
. I prefer to begin myARRAY
names with an underscore. Think of this as a shortcut or nickname for all the variables you want to reference. - The subscript. This identifies how many variables/elements are in your
ARRAY
. - Finally, the variables that are in your
ARRAY
.
An array with an explicit subscript and explict variables
One way to define your array is through explicit subscripts and variables, much like the example above. Below is a full example:
data work.patients_expl_array;
set work.patients;
length opioid_flag 3.; /* #1 */
array _icd [4] dx1 - dx4; /* #2 */
opioid_flag = 0; /* #3 */
do i = 1 to 4; /* #4 */
if _icd[i] = "F1194" then opioid_flag = 1; /* #5 */
end; /* #6 */
drop i; /* #7 */
run;
Results:
enrolid | dx1 | dx2 | dx3 | dx4 | opioid_flag |
---|---|---|---|---|---|
2 | A821 | E133519 | F1194 | E133299 | 1 |
4 | E75241 | E731 | E088 | E083291 | 0 |
4 | E75241 | A5139 | E088 | E083291 | 0 |
5 | E0810 | A5139 | E133291 | A710 | 0 |
5 | E0810 | A360 | E133291 | A710 | 0 |
6 | A3790 | A4289 | A5041 | F19121 | 0 |
7 | F810 | A4289 | A5041 | A300 | 0 |
7 | F810 | E083291 | A5041 | A300 | 0 |
7 | F810 | E083291 | A5041 | E103591 | 0 |
8 | E531 | A4150 | A5041 | E103591 | 0 |
8 | E531 | A4150 | A5433 | F0630 | 0 |
9 | A5275 | F04 | A5433 | F0630 | 0 |
10 | E853 | A231 | E781 | A5139 | 0 |
11 | F554 | A5423 | E13641 | A5139 | 0 |
12 | E848 | A5423 | E13641 | E7881 | 0 |
12 | E848 | E75240 | E13641 | E7881 | 0 |
13 | A663 | E000 | F19980 | A5423 | 0 |
15 | F11129 | A3790 | A5433 | E75240 | 0 |
15 | F11129 | A3790 | A5433 | E731 | 0 |
16 | A3700 | A3790 | A5433 | E731 | 0 |
16 | A3700 | E133519 | E13641 | E0939 | 0 |
17 | A5139 | E133519 | E13641 | E0939 | 0 |
18 | E030 | F524 | A231 | A1859 | 0 |
20 | A5002 | E7500 | A3952 | F18951 | 0 |
20 | A5002 | F13280 | A839 | F18951 | 0 |
21 | F524 | F13280 | A839 | E089 | 0 |
22 | F11129 | E1137X9 | A5041 | E103219 | 0 |
25 | A4289 | E71313 | F1293 | A218 | 0 |
25 | A4289 | E71313 | F04 | A218 | 0 |
27 | A5139 | E030 | A5209 | E103219 | 0 |
27 | A5139 | F0781 | A5209 | E103219 | 0 |
30 | E28319 | F13232 | A5209 | F812 | 0 |
31 | E08311 | F13232 | A5209 | F1821 | 0 |
32 | F13280 | A5139 | E731 | A5209 | 0 |
32 | F13280 | A5402 | A5275 | A218 | 0 |
33 | E103591 | A5402 | A5275 | A218 | 0 |
33 | E103591 | F554 | A5275 | A218 | 0 |
33 | E103591 | F554 | E093499 | F18951 | 0 |
34 | E13641 | F521 | E093499 | F18951 | 0 |
35 | E1137X9 | F810 | E133519 | E093541 | 0 |
36 | A5002 | F554 | E133519 | F11288 | 0 |
38 | A523 | F15980 | F13181 | A5139 | 0 |
44 | E853 | E731 | A5423 | F1194 | 1 |
45 | E0939 | F1194 | E0939 | E133299 | 1 |
45 | E0939 | F521 | E0939 | E133299 | 0 |
46 | A300 | F521 | A4150 | F19121 | 0 |
46 | A300 | A199 | E103591 | E211 | 0 |
47 | F13181 | A199 | E103591 | E211 | 0 |
47 | F13181 | E58 | E030 | F19980 | 0 |
49 | A3700 | E853 | F19980 | F0781 | 0 |
Let’s break this down:
- It’s always good practice to set the length of numeric variables to the smallest number of bytes (3) when you’re creating a dichotomous flag. SAS will default to 8 if you do not set it here.
- Here’s our array statement. Instead of typing out each variable (
dx1
,dx2
, etc.), I use a hyphen to specify the range of variables betweendx1
anddx4
. - I set the default value of
opioid_flag
to 0. This means each individuak will have a value of 0 foropioid_flag
unless they have the diagnosis code for opioid use. - To use our
ARRAY
statement, we have to use theDO
statement. This statement tells SAS that for each observation/record,DO
or execute the following code from 1 to 4 times.- Also, notice the
i
here. We’re essentially creating a variable,i
, that takes on the value based on theDO
statement or loop. Each time theDO
loop iterates, this value will increase by 1. Once the value exceedsi
(here, it would be 5), theDO
loop or statement stops and proceeds to the next observation.
- Also, notice the
- Here’s where we use our
ARRAY
statement. Remember, ourARRAY
name is basically a shortcut or nickname to refer to all of ourdx
variables.- I’m going to focus right on the
_icd[i]
part since that often trips people up and makes them avoid arrays. - Since we are in a
DO
loop,_icd[i]
is first going to look at our value ofi
. If we are in the first iteration of theDO
loop for a single observation,i
will be equal to 1. Accordingly,_icd[1]
will resolve todx1
. - We can then read it like our naive
DATA
step orPROC SQL
query – that is, ifdx1
is equal toF1194
, thenopioid_flag
is equal to 1.
- I’m going to focus right on the
- For every
DO
statement, you must have anEND
statement. SAS will reach this statement, and ifi
has not exceeded the end value in ourDO
loop, it will return to the top, incrementi
by 1 and repeat the process outlined in point5
above. - Finally, we’re going to use the
drop
statement to remove the variablei
from theworkpatients_expl_array
data set.- If you choose to keep
i
in your data set, you will see that it holds a value of 5 for each observation. This is exactly what I was talking about in step 6.
- If you choose to keep
An array with an implicit subscript and implicit variables
The above code can become problematic if you’re not careful. Let’s say that you have 5 diagnosis codes, but you leave the subscript as 4 – you’re not going to examine that fifth diagnosis code, and your results may be problematic.
This is where implicit subscripts and implicit variables become helpful. They’re basically shorthand ways of letting the computer do the counting and grouping for you. What do I mean? You will probably see that most of your cohort members use this kind of ARRAY
.
data work.patients_impl_array;
set work.patients;
length opioid_flag 3.;
array _icd [*] dx:; /* #1 */
opioid_flag = 0;
do i = 1 to dim(_icd); /* #2 */
if _icd[i] = "F1194" then opioid_flag = 1;
leave; /* #3 */
end;
drop i;
run;
Since most of the steps are the same, I only commented out sections that I did not explain previously:
- Here, we are doing the same thing as above. We use the
ARRAY
statement and ourARRAY
name (_icd
).- Instead of explicitly defining how many variables or elements exist in our array, I use an asterisk
*
to tell SAS to count the number of variables included in myARRAY
. - So what variables are in my
ARRAY
? Anything beginning withdx
. The colon (:
) tells SAS to include all variables that begin with this prefix.
- Instead of explicitly defining how many variables or elements exist in our array, I use an asterisk
- Again, this is a
DO
loop. The only difference is that we are using theDIM
function.- We are telling SAS to create that
i
variable that takes on a value from 1 to however many variables are in theARRAY
. TheDIM
function essentially acts as an upper bound that is defined by however many variables SAS found with thedx
prefix.
- We are telling SAS to create that
- Here, I added a
leave
statement. This will save you a lot of time if you’re working with large files, which is obviously the case with claims data.- This tells SAS to leave the
DO
loop if theIF-THEN
statement resolves to 1.
- This tells SAS to leave the
I’m not going to show the results since they are the same.
Arrays with multiple conditions
This doesn’t require much explanation. It builds on the same thing as above, except we use the IN
operator.
I’m flagging the following conditions: “Diabetes mellitus due to underlying condition with unspecified diabetic retinopathy with macular edema” and “Diabetes mellitus due to underlying condition with unspecified complications”.
data work.patients_array_mult_dx;
set work.patients;
length diabetes_flag 3.;
array _icd [*] dx:;
diabetes_flag = 0;
do i = 1 to dim(_icd);
if _icd[i] in ("E08311", "E088") then diabetes_flag = 1;
end;
drop i;
run;
Results:
enrolid | dx1 | dx2 | dx3 | dx4 | diabetes_flag |
---|---|---|---|---|---|
2 | A821 | E133519 | F1194 | E133299 | 0 |
4 | E75241 | E731 | E088 | E083291 | 1 |
4 | E75241 | A5139 | E088 | E083291 | 1 |
5 | E0810 | A5139 | E133291 | A710 | 0 |
5 | E0810 | A360 | E133291 | A710 | 0 |
6 | A3790 | A4289 | A5041 | F19121 | 0 |
7 | F810 | A4289 | A5041 | A300 | 0 |
7 | F810 | E083291 | A5041 | A300 | 0 |
7 | F810 | E083291 | A5041 | E103591 | 0 |
8 | E531 | A4150 | A5041 | E103591 | 0 |
8 | E531 | A4150 | A5433 | F0630 | 0 |
9 | A5275 | F04 | A5433 | F0630 | 0 |
10 | E853 | A231 | E781 | A5139 | 0 |
11 | F554 | A5423 | E13641 | A5139 | 0 |
12 | E848 | A5423 | E13641 | E7881 | 0 |
12 | E848 | E75240 | E13641 | E7881 | 0 |
13 | A663 | E000 | F19980 | A5423 | 0 |
15 | F11129 | A3790 | A5433 | E75240 | 0 |
15 | F11129 | A3790 | A5433 | E731 | 0 |
16 | A3700 | A3790 | A5433 | E731 | 0 |
16 | A3700 | E133519 | E13641 | E0939 | 0 |
17 | A5139 | E133519 | E13641 | E0939 | 0 |
18 | E030 | F524 | A231 | A1859 | 0 |
20 | A5002 | E7500 | A3952 | F18951 | 0 |
20 | A5002 | F13280 | A839 | F18951 | 0 |
21 | F524 | F13280 | A839 | E089 | 0 |
22 | F11129 | E1137X9 | A5041 | E103219 | 0 |
25 | A4289 | E71313 | F1293 | A218 | 0 |
25 | A4289 | E71313 | F04 | A218 | 0 |
27 | A5139 | E030 | A5209 | E103219 | 0 |
27 | A5139 | F0781 | A5209 | E103219 | 0 |
30 | E28319 | F13232 | A5209 | F812 | 0 |
31 | E08311 | F13232 | A5209 | F1821 | 1 |
32 | F13280 | A5139 | E731 | A5209 | 0 |
32 | F13280 | A5402 | A5275 | A218 | 0 |
33 | E103591 | A5402 | A5275 | A218 | 0 |
33 | E103591 | F554 | A5275 | A218 | 0 |
33 | E103591 | F554 | E093499 | F18951 | 0 |
34 | E13641 | F521 | E093499 | F18951 | 0 |
35 | E1137X9 | F810 | E133519 | E093541 | 0 |
36 | A5002 | F554 | E133519 | F11288 | 0 |
38 | A523 | F15980 | F13181 | A5139 | 0 |
44 | E853 | E731 | A5423 | F1194 | 0 |
45 | E0939 | F1194 | E0939 | E133299 | 0 |
45 | E0939 | F521 | E0939 | E133299 | 0 |
46 | A300 | F521 | A4150 | F19121 | 0 |
46 | A300 | A199 | E103591 | E211 | 0 |
47 | F13181 | A199 | E103591 | E211 | 0 |
47 | F13181 | E58 | E030 | F19980 | 0 |
49 | A3700 | E853 | F19980 | F0781 | 0 |
Arrays with the IN: operator
The IN
operator is great, but we’re only grabbing two diagnoses within the E08
category that deal with diabetes. An awesome, short-hand technique to access all codes with a certain prefix is the IN
with the colon modifier (IN:
).
It’s not very well-documented in the SAS documentation, but it is available in a paper here.
data work.patients_array_mult_dx_in;
set work.patients;
length diabetes_flag 3.;
array _icd [*] dx:;
diabetes_flag = 0;
do i = 1 to dim(_icd);
if _icd[i] in: ("E08") then diabetes_flag = 1;
end;
drop i;
run;
Using a
leave
statement withIN:
doesn’t flag records properly. I’m not quite sure why, but this was an issue for me on SAS Studio (free) and my personal, local version of SAS 9.4.
Results:
enrolid | dx1 | dx2 | dx3 | dx4 | diabetes_flag |
---|---|---|---|---|---|
2 | A821 | E133519 | F1194 | E133299 | 0 |
4 | E75241 | E731 | E088 | E083291 | 1 |
4 | E75241 | A5139 | E088 | E083291 | 1 |
5 | E0810 | A5139 | E133291 | A710 | 1 |
5 | E0810 | A360 | E133291 | A710 | 1 |
6 | A3790 | A4289 | A5041 | F19121 | 0 |
7 | F810 | A4289 | A5041 | A300 | 0 |
7 | F810 | E083291 | A5041 | A300 | 1 |
7 | F810 | E083291 | A5041 | E103591 | 1 |
8 | E531 | A4150 | A5041 | E103591 | 0 |
8 | E531 | A4150 | A5433 | F0630 | 0 |
9 | A5275 | F04 | A5433 | F0630 | 0 |
10 | E853 | A231 | E781 | A5139 | 0 |
11 | F554 | A5423 | E13641 | A5139 | 0 |
12 | E848 | A5423 | E13641 | E7881 | 0 |
12 | E848 | E75240 | E13641 | E7881 | 0 |
13 | A663 | E000 | F19980 | A5423 | 0 |
15 | F11129 | A3790 | A5433 | E75240 | 0 |
15 | F11129 | A3790 | A5433 | E731 | 0 |
16 | A3700 | A3790 | A5433 | E731 | 0 |
16 | A3700 | E133519 | E13641 | E0939 | 0 |
17 | A5139 | E133519 | E13641 | E0939 | 0 |
18 | E030 | F524 | A231 | A1859 | 0 |
20 | A5002 | E7500 | A3952 | F18951 | 0 |
20 | A5002 | F13280 | A839 | F18951 | 0 |
21 | F524 | F13280 | A839 | E089 | 1 |
22 | F11129 | E1137X9 | A5041 | E103219 | 0 |
25 | A4289 | E71313 | F1293 | A218 | 0 |
25 | A4289 | E71313 | F04 | A218 | 0 |
27 | A5139 | E030 | A5209 | E103219 | 0 |
27 | A5139 | F0781 | A5209 | E103219 | 0 |
30 | E28319 | F13232 | A5209 | F812 | 0 |
31 | E08311 | F13232 | A5209 | F1821 | 1 |
32 | F13280 | A5139 | E731 | A5209 | 0 |
32 | F13280 | A5402 | A5275 | A218 | 0 |
33 | E103591 | A5402 | A5275 | A218 | 0 |
33 | E103591 | F554 | A5275 | A218 | 0 |
33 | E103591 | F554 | E093499 | F18951 | 0 |
34 | E13641 | F521 | E093499 | F18951 | 0 |
35 | E1137X9 | F810 | E133519 | E093541 | 0 |
36 | A5002 | F554 | E133519 | F11288 | 0 |
38 | A523 | F15980 | F13181 | A5139 | 0 |
44 | E853 | E731 | A5423 | F1194 | 0 |
45 | E0939 | F1194 | E0939 | E133299 | 0 |
45 | E0939 | F521 | E0939 | E133299 | 0 |
46 | A300 | F521 | A4150 | F19121 | 0 |
46 | A300 | A199 | E103591 | E211 | 0 |
47 | F13181 | A199 | E103591 | E211 | 0 |
47 | F13181 | E58 | E030 | F19980 | 0 |
49 | A3700 | E853 | F19980 | F0781 | 0 |
Sorry that I didn’t do any gifs – I’m tired.
Let me know if you need anything clarified or if there are any issues in the post.
Program for download is here.