Search
 
 

Display results as :
 


Rechercher Advanced Search

Keywords

Affiliates
free forum


Top 10 tips and tricks about PROC SQL

View previous topic View next topic Go down

Top 10 tips and tricks about PROC SQL

Post  pallav on Wed Jun 27, 2012 3:45 pm



Interestingly, I just found that the most searched keyword is PROC SQL, through the traffic analysis of my tiny blog. The reason possibly is: nowadays everybody knows SQL, more or less; then someone can do some parts of the SAS job by PROC SQL without using any procedure or DATA step.

As a dialect of SQL, PROC SQL can really play a lot of tricks in SAS, which are mostly carried out by DATA step or PROCs. Here I summarize the 10 interesting ones among them. Let's start with the free SAS help dataset SASHELP.CLASS. This dataset contains the weight, height, sex and other information of 19 fake teenagers. In this demo, I primarily paly with the WEIGHT variable from it.

http://saslearn.blogspot.in/2012/06/top-10-tips-and-tricks-about-proc-sql.html


data class;
set sashelp.class;
obs = _n_;
run;

1. Calculate the median of a variable

This is the task usually done by PROC MEANS. With the aggregating HAVING clause, PROC SQL can fulfill this purpose as well.
proc sql;
select avg(weight)
from (select e.weight
from class e, class d
group by e.weight
having sum(case when e.weight = d.weight then 1 else 0 end)
ge abs(sum(sign(e.weight - d.weight))));
quit;


http://saslearn.blogspot.in/2012/06/top-10-tips-and-tricks-about-proc-sql.html


2. Draw a horizontal bar chart
It is a substitute in case that we want to have a look at the distribution of a variable but don’t know PROC GPLOT or other plotting procedures.
proc sql;
select age, '|',
repeat('*',count(*)*4) as frequency
from class
group by age
order by age;
quit;

http://saslearn.blogspot.in/2012/06/top-10-tips-and-tricks-about-proc-sql.html


3. Return the running total for a variable

DATA step and PROC IML both can get this job done, like what I have shown at a previous post. PROC SQL is the 3rd choice, just with the little help of a subquery at the SELECT clause.
proc sql;
select name, weight,
(select sum(a.weight) from class as a where a.obs <= b.obs) as running_total
from class as b;
quit;


http://saslearn.blogspot.in/2012/06/top-10-tips-and-tricks-about-proc-sql.html


4. Report data with subtotal

The first thing in my mind is the powerful REPORT procedure. However, PROC SQL is an alternative with the set operator.
proc sql;
select name, weight
from class
union all
select 'Total', sum(weight)
from class;
quit;

5. Find the column information from metadata

SAS stores the metadata at its DICTIONARY datasets. PROC SQL can visit the column information easily, without using the CONTENTS procedure.
proc sql;
select name, type, varnum
from sashelp.vcolumn
where libname = 'WORK' and memname = 'CLASS';
quit;


http://saslearn.blogspot.in/2012/06/top-10-tips-and-tricks-about-proc-sql.html


6. Rank a variable

PROC RANK is a really handy tool for this functionality. Besides it, PROC SQL can do some simple ranking as well.
proc sql;
select name, a.weight, (select count(distinct b.weight)
from class b
where b.weight <= a.weight) as rank
from class a;
quit;

http://saslearn.blogspot.in/2012/06/top-10-tips-and-tricks-about-proc-sql.html



7. Random sampling

PROC SURVEYSELECT involves with too much statistics. If we just need a simple random sampling, PROC SQL is a more popular option. For example, if I want to pick out 8 observations, I can use PROC SQL like:
proc sql outobs = 8;
select *
from class
order by ranuni(1234);
quit;

8. Replicate a data set without data

In PROC SQL, it is a fairly simple one-sentence statement to create an empty data set. Not sure DATA step can do it.DATA step also has a way to do it.
proc sql;
create table class2 like class;
quit;

http://saslearn.blogspot.in/2012/06/top-10-tips-and-tricks-about-proc-sql.html


9. Transpose data
DATA step ARRAY is a big headache for most people. Suppose that we want to list the names of the teenagers by their genders, PROC SQL has a way to do it, although it may require some complicated queries/subqueries.
proc sql;
select max(case when sex='F'
then name else ' ' end) as Female,
max(case when sex='M'
then name else ' ' end) as Male
from (select e.sex,
e.name,
(select count(*) from class d
where e.sex=d.sex and e.obs < d.obs) as level
from class e)
group by level;
quit;


http://saslearn.blogspot.in/2012/06/top-10-tips-and-tricks-about-proc-sql.html


10. Count the missing values


The great thing in PROC SQL is that the NMISS and N functions under it work for both numeric and character variables. We can also write a macro based on them to search the missing values for all variables.
proc sql
select count(*), nmiss(weight), n(weight)
from class;
quit;


http://saslearn.blogspot.in/2012/06/top-10-tips-and-tricks-about-proc-sql.html

pallav

Posts : 98
Join date : 2012-03-14
Location : Ahmedabad

Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum