# Deriving Quantiles or Percentiles in R vs SAS

### Data

The following data will be used to show the differences between the default percentile definitions used by SAS and R:

`10, 20, 30, 40, 150, 160, 170, 180, 190, 200`
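To follow along in R, the ten values can be placed in a data frame. This is only a minimal sketch: the names `adlb` and `aval` match those used in the rest of this page, but how the real dataset is created or loaded is an assumption.

```r
# Assumed construction of the example dataset; in practice `adlb`
# would normally be read from an existing analysis dataset.
adlb <- data.frame(
  aval = c(10, 20, 30, 40, 150, 160, 170, 180, 190, 200)
)

# Quick check of the values before computing any percentiles
str(adlb)
```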

### SAS Code

Assuming the data above is stored in the variable `aval` within the dataset `adlb`, the 25th and 40th percentiles could be calculated using the following code.

```
proc univariate data=adlb;
  var aval;
  output out=stats pctlpts=25 40 pctlpre=p;
run;
```

This procedure creates the dataset `stats` containing the variables `p25` and `p40`.

The procedure has the option `PCTLDEF`, which allows for five different percentile definitions to be used. The default is `PCTLDEF=5`.

### R code

The 25th and 40th percentiles of `aval` can be calculated using the `quantile` function.

`quantile(adlb$aval, probs = c(0.25, 0.4))`

This gives the following output.

```
25% 40%
32.5 106.0
```

The function has the argument `type`, which allows for nine different percentile definitions to be used. The default is `type = 7`.
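To make the default definition concrete, the sketch below reproduces `type = 7` by hand: it interpolates linearly between the two order statistics on either side of position `(n - 1) * p + 1`. The helper name `quantile_type7` is hypothetical and introduced only for illustration; it is not part of base R and it does no input checking.

```r
# Illustrative re-implementation of R's default quantile definition (type 7).
# `quantile_type7` is a hypothetical helper, not a base R function.
quantile_type7 <- function(x, probs) {
  x <- sort(x)
  n <- length(x)
  h <- (n - 1) * probs + 1   # fractional position in the sorted data
  lo <- floor(h)             # order statistic at or below the position
  hi <- ceiling(h)           # order statistic at or above the position
  x[lo] + (h - lo) * (x[hi] - x[lo])
}

aval <- c(10, 20, 30, 40, 150, 160, 170, 180, 190, 200)

# Compare the hand calculation with the built-in default
quantile_type7(aval, c(0.25, 0.4))
quantile(aval, probs = c(0.25, 0.4))
```

For the 40th percentile the position is 4.6, so the result lies 60% of the way from the 4th value (40) to the 5th value (150).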

### Comparison

The default percentile definition used by the UNIVARIATE procedure in SAS finds the 25th and 40th percentiles to be 30 and 95. The default definition used by R finds these percentiles to be 32.5 and 106.

It is possible to get the quantile function in R to use the same definition as the default used in SAS, by specifying `type=2`.

`quantile(adlb$aval, probs = c(0.25, 0.4), type=2)`

This gives the following output.

```
25% 40%
30 95
```
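The reason these results differ from the R default can be seen by writing the definition out. With `type=2` (the empirical distribution function with averaging), the percentile is the next observation above position `n * p`, unless `n * p` is a whole number, in which case the two neighbouring observations are averaged. The following is a minimal sketch under that reading; `quantile_type2` is a hypothetical helper name, and the `1e-9` tolerance for detecting whole numbers is an assumption chosen for illustration rather than the value used internally by R or SAS.

```r
# Illustrative re-implementation of quantile type 2 (averaging at discontinuities).
# `quantile_type2` is a hypothetical helper, not a base R function.
quantile_type2 <- function(x, probs) {
  x <- sort(x)
  n <- length(x)
  np <- n * probs
  whole <- abs(np - round(np)) < 1e-9       # assumed tolerance for "n * p is an integer"
  j <- ifelse(whole, round(np), floor(np))
  lower <- x[pmax(j, 1)]                    # guard against index 0 when p = 0
  upper <- x[pmin(j + 1, n)]                # guard against index n + 1 when p = 1
  ifelse(whole, (lower + upper) / 2, upper)
}

aval <- c(10, 20, 30, 40, 150, 160, 170, 180, 190, 200)

# Compare the hand calculation with the built-in type 2
quantile_type2(aval, c(0.25, 0.4))
quantile(aval, probs = c(0.25, 0.4), type = 2)
```

Here `n * p` is 2.5 for the 25th percentile, so the 3rd value (30) is returned, and exactly 4 for the 40th percentile, so the 4th and 5th values (40 and 150) are averaged to give 95.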

It is not possible to get the UNIVARIATE procedure in SAS to use the same definition as the default used in R.

Rick Wicklin provided a blog post showing how SAS has built-in support for calculations using five of the nine percentile definitions available in R, and also demonstrated how a SAS/IML function can be used to calculate percentiles using the other four definitions.
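To see how much the choice of definition matters for this small dataset, all nine R definitions can be computed side by side. This is a sketch only. The correspondence between `PCTLDEF` values and R types noted in the comments is the commonly cited one and is stated here as an assumption; it should be checked against the SAS documentation or the blog post before being relied on.

```r
aval <- c(10, 20, 30, 40, 150, 160, 170, 180, 190, 200)

# One column per R quantile type, one row per requested percentile
all_types <- sapply(1:9, function(t) {
  quantile(aval, probs = c(0.25, 0.4), type = t)
})
colnames(all_types) <- paste0("type", 1:9)
all_types

# Assumed correspondence (verify before use):
#   PCTLDEF=1 ~ type 4, PCTLDEF=2 ~ type 3, PCTLDEF=3 ~ type 1,
#   PCTLDEF=4 ~ type 6, PCTLDEF=5 ~ type 2
# Under that assumption, types 5, 7, 8 and 9 have no PCTLDEF equivalent.
```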

More information about quantile derivation can be found in the SAS blog.

### Key references:

Compare the default definitions for sample quantiles in SAS, R, and Python