=lung pearson;
proc corr data
var age mealcal; run;
Correlation Analysis using SAS
Example: Lung Cancer Data
Data source: Loprinzi CL. Laurie JA. Wieand HS. Krook JE. Novotny PJ. Kugler JW. Bartel J. Law M. Bateman M. Klatt NE. et al. Prospective evaluation of prognostic variables from patient-completed questionnaires. North Central Cancer Treatment Group. Journal of Clinical Oncology. 12(3):601-7, 1994.
Survival in patients with advanced lung cancer from the North Central Cancer Treatment Group. Performance scores rate how well the patient can perform usual daily activities.
Overview
The CORR
procedure computes Pearson correlation coefficients, three nonparametric measures of association, and the probabilities associated with these statistics. The correlation statistics include the following:
Pearson product-moment correlation
Spearman rank-order correlation
Kendall’s tau-b coefficient
Hoeffding’s measure of dependence,
Pearson, Spearman, and Kendall partial correlation
This program works on the first three correlation coefficients.
Missing Values
PROC CORR excludes observations with missing values in the WEIGHT and FREQ variables. By default, PROC CORR uses pairwise deletion when observations contain missing values. PROC CORR includes all nonmissing pairs of values for each pair of variables in the statistical computations. Therefore, the correlation statistics might be based on different numbers of observations.
If you specify the NOMISS option, PROC CORR uses listwise deletion when a value of the VAR or WITH statement variable is missing. PROC CORR excludes all observations with missing values from the analysis. Therefore, the number of observations for each pair of variables is identical.
The PARTIAL statement always excludes the observations with missing values by automatically invoking the NOMISS option. With the NOMISS option, the data are processed more efficiently because fewer resources are needed. Also, the resulting correlation matrix is nonnegative definite.
In contrast, if the data set contains missing values for the analysis variables and the NOMISS option is not specified, the resulting correlation matrix might not be nonnegative definite. This leads to several statistical difficulties if you use the correlations as input to regression or other statistical procedures.
Pearson Correlation
Spearman Correlation
=lung spearman;
proc corr data
var age mealcal; run;
Kendall’s rank correlation
=lung kendall;
proc corr data
var age mealcal; run;