Merge a Microsoft Excel File with a SAS Data Set


In this video, you learn to merge a Microsoft
Excel file with a SAS data set. Let me show you how to use SAS/ACCESS
Interface to PC Files and Base SAS to merge a
Microsoft Excel file with a SAS Data Set. Here is the Excel workbook. There is a
worksheet, Address info. This worksheet
contains a named range address. The named range contains the identification
numbers, names, and addresses of students
who took the SAT. It looks like the Excel data is nicely arranged in
ascending order of IDNumber. The SAS data set testscores contains
identification numbers, SAT scores, and the
gender of students who took the SAT. I would like to create a new data set that has
the SAT scores and addresses in one row, or
observation. To do this, I’ll need to horizontally combine the
Excel worksheet with the SAS data set. In SAS,
this is called a merge. In SAS, we have to define libraries to work with
the SAS data. When we define a library, we tell SAS where the
data is located and what type of data it is. So I’ll need to begin with this LIBNAME
statement to access the SAS data.
libname is a keyword, libsas is the label that I’ll use to refer to the SAS library in the rest of the
code. We call it a libref, or an alias. The remainder of the statement tells SAS the
folder where the SAS data sets are located. I’m going to highlight the LIBNAME statement
and use the Run toolbar to submit the code for
execution. The log tells me the library has been assigned.
I like to view the data in SAS Explorer right
away to verify that I can see the data. Let me click on the Explorer tab. I’ll click on
Libraries, and I’ll see the libraries that have been
defined. Here is the libsas library I just defined. Let’s
double-click it. I see the testscores SAS data set as well as
other SAS data sets in this library. I’ll double-click testscores so I can view the
data. The data looks pretty good, but I notice
right away the testscores data set is not properly arranged by IDNumber. Before I can
perform a merge, I’ll have to make sure I sort the
data by IDNumber. I’ll address sorting a little later. For now, I’ll close
the table viewer by clicking X. Let’s move on and access the Excel workbook
in SAS. To access the Excel workbook, we must also define a library using a LIBNAME
statement. libnname again is a keyword. exldata is the label I’ll use to refer to the
workbook, which is used as a library. pcfiles is the engine option that reads Microsoft
Excel 32-bit data in a 64-bit SAS environment. pcfiles is the engine option that reads Microsoft
Excel 32-bit data in a 64-bit SAS environment. Depending on the type of data you are
accessing, SAS has many engine types that
you can use. For example, if Microsoft Excel and SAS have
the same bit version, you can use the EXCEL
engine option to read the Excel data. In this case, because we are connecting to an
Excel workbook, the remainder of the statement
tells SAS the location of the Excel workbook with the PATH=option. Let me highlight the
LIBNAME statement and execute it. I will view the Explorer again to verify that I can
see the data I just defined. I will go up one level
to Libraries. I see the exldata library has been added to the
list of active libraries. If I double-click it, I see the items that
correspond to the worksheet I showed you in
Excel. Address without the dollar sign is the named
range, and Address with the dollar sign is the
worksheet. I’ll double-click Address. A table viewer opens, and I am now viewing the
data that is being read directly from the name
range Address in the workbook. Let me close the table viewer by selecting X.
Now that I have access to the Excel file and the
SAS data set in the same SAS session, I can merge the Excel data with the SAS data set. I
will need to identify a key column that both files
have in common. I recall that both the Excel file and the
SAS data set have a unique identification
number. It is called IDNumber in both files. I’ll use this as the key column to match the rows
from both files. It’s also a requirement that both files be
arranged in sorted order by the key column;
IDNumber. Here is where I’ll need to sort the testscores
data set. So back to my program. Before I can proceed with the merge, I must
perform an intermediate step that rearranges the
testscores data set in order by IDNumber. The SORT procedure will do this for me! Here is
the SORT procedure code. PROC SORT starts
the procedure. The DATA=option specifies the data set I want
to sort because I don’t want to overwrite the
permanent data set, libsas.testscores. I’ll use the OUT=option to create a temporary
sorted copy of the data set. I’ll call it testscores
and store it in the work library. The BY statement specifies how to arrange the
data. I’ll arrange the sorted data in ascending
order by IDNumber. Ascending order is the default, so nothing else
is needed here. When I highlight and run this step it creates the
sorted data set. Here in the log, I verify that the
data set was created. The sorted copy of work.testscores contains 80
observations and three variables. Now, we are ready to merge the Excel file with
the sorted copy, work.testscores. Let’s take a
look at the program. The DATA statement names the SAS data set I
want to create. I’ll name the data set
SATScoreAddress and store it in the work library. The MERGE statement identifies the
Excel file and the SAS data set I want to merge.
In this case, I’ll use the name range Excel.Address and the sorted data set
work.testscores. The BY statement tells SAS to match the data
from the two files by IDNumber. I’ll highlight the program and use the RUN button
to submit the code. Let’s look at the log to
confirm the new data set work.SATScoreAddress has been created and it
combines data from the Microsoft Excel file and
the SAS data set. No errors or warnings, which is great.
Since the DATA step does not create a report,
I’ll go to the Explorer to verify the final results. I’ll select the Explorer tab, go up one level,
double-click the work library, and open the
SATScoreAddress data set. As you can see, we have the address
information that came from the Excel worksheet
and the test scores that came from the SAS data set. I’ll go ahead and close this by
selecting X. One last thing: when you are working with third-
party data, you should make it a habit to close
your connection to the data. Back in the editor, we’ll close the connection to
Excel by using this LIBNAME statement.
Specify the library, exldata, and the CLEAR option. I’ll go ahead and highlight this code and
submit. The CLEAR option disconnects SAS from the
exldata library—in this case, the Excel
workbook. In this demonstration you learned how you can
merge a Microsoft Excel file with a SAS data
set. Thank you for your interest in SAS!

5 thoughts on “Merge a Microsoft Excel File with a SAS Data Set

Leave a Reply

Your email address will not be published. Required fields are marked *