Thursday, 30 June 2016

Some of the top and the excellent reasons to use PROC SQL : Epoch Research Institute India Pvt. Ltd. (www.epoch.co.in)

Some of the top and the excellent reasons to use PROC SQL : EpochResearch Institute India Pvt Ltd. (SASAuthorised Training : Ahmedabad | Bangalore)

SAS_Advanced_programming

PROC SQL is a very dominant tool. It can make your life much simpler and easier. I have seen SAS users divided among two groups. One of the groups loves PROC SQL while the other group hates PROC SQL. But if it comes to me, I would personally let you know that I fell in love with PROC SQL in earlier time during my learning periods. This is just because I have found that lot of tasks can be easily done in PROC SQL. Sometimes it even happens they are not even possible with other SAS procedures. As per my view, I have seen that there are various different advantages of PROC SQL in making SAS programming and coding task simpler and more spontaneous. Now these are  top reasons for using PROC SQL as shown below
  •     Fuzzy merge
The process of matching the records where the condition of a match is based on near but not the equal condition is known as Fuzzy merge. For e.g. in survival analysis, we need to know whether a member is dead so for the same we need to match our sample to the death records in death tape. We compare the variables that are matching and assign points to them. The score of 16 would be a great match. But when it comes to real life, it is not true. So our rule will be that the score of 13 and above will be considered a match. We will do a manual checking for any score between 9 and 12 whether it is a match.
PROC SQL;
CREATE TABLE REVIEW AS
SELECT *
FROM SAMPLE, OW_DEATH
WHERE SUM( ((KBMON =SBMON)*2), ((KBDAY =SBDAY)*1),
((KBYEAR =SBYEAR)*2),
((KFSNDX =SFSNDX)*1), ((KFNAME =SFNAME)*1),
((KMNAME =SMNAME)*1),
((KLSNDX =SLSNDX)*1), ((KLNAME =SLNAME)*1),
((KSEX =SSEX)*2) , ((KSSN =SSSN and KSSN ne ' ')*4) )
>=9;
QUIT;  

  •    Summarize the data
SQL functions can be used to summarize data. PROC SQL is more spontaneous than PROC MEANS or PROC SUMMARY. SAS creates and output table containing more rows and columns that are required and you have to choose right_TYPE_value.

  •    COALESCE function
PROC SQL makes the assortment process very easy and simpler where the COALESCE function will pick the first non-missing value.

  •   Insert record to table
I have mentioned the code below to estimate the mid-year membership reckoning from eligibility file for year from 1986 – 2002 (Note that this is just a demonstration purpose, the code is not efficient). Here there is a macro %DO-%END loop where each iteration will produce two macro variables and PROC SQL will insert a new record into the table. If we compared SQL to Base SAS processing, SQL saves one step.

%MACRO MULTI_YR (BY=, EY= );
*---creating empty table---;
DATA MYEARPOP; MYEAR=.; POP=.; DELETE; RUN;
%DO I= &BY %TO &EY;
PROC SQL NOPRINT;
SELECT COUNT(*) INTO :RECORDS
FROM CCPSSD.KPOPGAP2
WHERE FDATE LE "01jul&I"D LE TDATE;
;QUIT; %put &records;
PROC SQL; INSERT INTO MYEARPOP
SET MYEAR=&I,POP=&RECORDS
;QUIT;
%END;
%MEND MULTI_YR;
%MULTI_YR(BY=1986,EY=2002 );

  •    Matching several tables at different levels
Here I have joined three tables. Here the task to be performed is to get inpatient diagnoses existing in OO.ADT_DIAG for sample table. The middle table serves as a link for the other two tables as it contains both the matching variables.
PROC SQL;
CREATE TABLE ADT AS
SELECT A.HRN, DIAG
FROM SAMPLE AS A , OO.ADT_REG (dbkey=HRN dbnullkeys=no) AS B,
OO.ADT_DIAG (dbkey=MAIN_KEY dbnullkeys=no) AS C
WHERE A.HRN = B.HRN and B.MAIN_KEY=C.MAIN_KEY
;QUIT;

  •   Count frequencies
PROC SQL makes it an ease to count the non-missing values for number of variables and give an output on one line.
PROC SQL; SELECT COUNT(*) AS TOTAL,
COUNT(DIAG0001) AS DX1,
COUNT(DIAG0002) AS DX2,
COUNT(DIAG0003) AS DX3,
COUNT(DIAG0004) AS DX4
FROM INP; QUIT;
TOTAL DX1 DX2 DX3 DX4
--------------------------------------------------
1562 1562 1421 1163 814

  •    Textwrapping
You have a long character variable. Now you want to print the values using PROC PRINT. You will get a warning message such like: “Data too long for column “COMMENT” truncated to 124 characters to fit”. Use PROC SQL with flow option To avoid such an error message.

PROC SQL FLOW=30;
SELECT HRN, COMMENT
FROM A
;QUIT;
HRN COMMENT
--------------------------------------
12345678 LONGTEXTTTTTTTTTTTTTTTTTTTTT
TTTTTTTTTTTTTTTTTTTTTTTTTTTT
TTTTTTTTTTTTTTTTTTTTTTTTTTTT
TTTTTTTTTTTTTTTTTTTTTTTTTTTT
87654321 LONGTEXTTTTTTTTTTTTTTTTTTTTT
TTTTTTTTTTTTTTTTTTTTTTTTTTTT
TTTTTTTTTTTTTTTTTTTTTTTTTTTT
TTTTTTTTTTTTTTTTTTTTTTTTTTTT

My conclusion: PROC SQL is code-saving but it is not always time-saving. You can share your views if you have any.

Epoch Research Institute Links:

Email us: info@epoch.co.in

SAS Training & Placement Programs with Internship: Epoch Research Institute India Largest and Oldest #SASTraining Institute (#epochsastraining)

EPOCH RESEARCH INSTITUTE OFFERS:
Authorized SAS TRAINING | SAS CERTIFICATION | SOFTWARE PURCHASE | BUINESS CONSULTING | TECHNICAL SUPPORT ON SAS || SAS STAFFING SOLUTION 

Label:
#SASELEARNING,#SASELEARNING,#SASONLINETRAINING,
#SASONLINETRAININGFORBEGINNERS,#LEARNSASPROGRAMMINGONLINE,
#SASCLINICALONLINETRAINING,#SASBASEONLINETRAINING
#BIGDATASASTRAININGEPOCH,#SASBIGDATATRAINING #EPOCHRESEARCHINSTITUTE, #SASTRAINING, EPOCH SAS FEEDBACK,
#CLINICALSASPROGRAMMING, #CLINICALSAS, #CLINICALSASPROGRAMMER

No comments:

Post a Comment

How to code in Python with SAS 9.4 by Epoch Research Institute

The SAS® platform is now open to be accessed from open-source clients such as Python, Lua, Java, the R language, and REST APIs to leverage...