Montura Consulting   Research & Development
Components

SAS Institute advertises that The SAS System gives you "the power to know", and within a certain context that is true. There is one problem: that power is limited to statisticians who use canned SAS procedures, data analysts who need no knowledge of computer programming, and C-level executives who run the company through graphs, reports and summarized data.

Repository Relationship Programming (RRP) was specifically designed to provide the power to control "the power to know". This technology was designed for immediate use with all SAS solutions; however, the entire specification can be applied to all Microsoft solutions. The first objective in object programming is to minimize or eliminate the horrid mess that IT managers call "your SAS program". SAS analytics are powerful, but "the power to know" is just a low-voltage battery unless there is some way to apply granular controls over standard and optional portions of the programmed process. Granular controls are impossible to implement unless SAS programmers apply some form of source code organization. Refactoring and code beautification are a complete waste of time without some kind of system-level automation that reassembles the many blocks of code into the correct order for execution without adding code complexity.

RRP architecture was designed to force SAS programmers into better programming habits. Source code organization is easy because RRP has automation that reassembles and executes every source code used in the application, as if the entire application were a single source code. Refactoring does not increase or decrease code complexity. Granular controls that indicate enable/disable can be applied to every piece of source code at runtime, and each may be toggled on/off during execution based on row-level data.

Some RRP design patterns are mandatory; failure to follow the pattern may result in misleading error/alert information in the SAS Log.


What is SCL (SAS Component Language)

SCL is a scripting language that replaces SAS/Macro.

SCL is compiled. From a coding perspective, method and class statements result in SAS objects that look just like Java objects.

SCL is used as a wrapper for Base/SAS; Macro works the same way.

SCL operates one level above Base/SAS. Errors and crashes that halt the execution of Base/SAS and Macro code have no effect on the execution of the SCL object.

SAS objects are written in SCL.


Object Application Architecture

Object architecture is easy to understand because there are very few items to cover. Object applications are composed of five items - one is data while the other four are SCL code. The only corresponding item in Base/SAS is "Task Definition" which represents the average SAS program - data steps, procedures, and macro.

BONUS: RRP provides SAS coders with one major advantage over all other architectures. The architecture is exactly the same for SCL apps that execute in batch mode and SCL apps that control SAS/Frame (GUI) behaviors. Every SAS object uses a single code skeleton for a baseline standard. The skeleton contains four methods, which are blocks of code (similar to a macro) that are executed from the runInterface method, which serves as the standardized name for executing all SAS objects. When you understand how the skeleton works, you understand how every SAS object works. When you know how SAS batch mode programs are assembled, executed, and debugged, then you know 75% of what you need for SAS GUI programming.

SAS object applications are composed of five artifacts that describe "how the application works".

  1. Application Definition
  2. Application Instantiator
  3. Controller
  4. Task Definition
  5. Global Shared Property

Application Definition

RRP is designed to work on a collection of programs, where each repository is one collective.

  • Take a SAS application that has three %include statements. You chop that into physically separate source code to get rid of the %include statements and you are left with a series of SAS programs that execute in sequence.

RRP executes each program from each repository in sequence, so you need to specify which is first, second, etc. The controller, covered as Item #3, contains a "built in" main driver that performs automatic error checking after each repository program completes, so you will not need to reinvent that wheel.

  • Create a SLIST or SAS dataset; order of appearance can be the default execution order.
  • Insert the 4-level qualifier for each SAS program (for objects only).
  • Insert the physical path and SAS program name (for macros only).

Example SLIST

The application definition may reference programs from any SAS catalog. This example contains programs that are specific to this application as well as programs that are considered "common use" and can be used by many applications. Many object apps have 50+ entries. This collection is referred to as the Program Work List. Its content defines the application.

SLIST(
      'common.setup.timestamp.class' {C}
      'montura.app1.xml.class' {C}
      'montura.app1.report.class' {C}
     )[3]
         

* The controller reads the list of SAS programs in Application Definition and executes each in turn. Application Definition is the way object-oriented programs are "included", without using the include statement.

* You should be able to read the definition list and know every SAS program that is executed from first to last.

* Nested includes eliminate the programmer's ability to search and debug source code.

* Nested includes eliminate the ability to automatically detect when required source code is absent.
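
The two properties claimed for a flat definition list (a readable execution order and automatic detection of absent source code) can be modeled outside SAS. The following is a minimal sketch in Python rather than SCL, so it can run without a SAS session. The entry names come from the SLIST example; the functions `parse_entry` and `missing_entries` and the `available` set are hypothetical names invented for this illustration and are not part of RRP.

```python
# Hypothetical model of an Application Definition (Program Work List).
# Entry names come from the SLIST example; the `available` set is invented.

def parse_entry(entry):
    """Split a four-level name (libref.catalog.entry.type) into its parts."""
    parts = entry.split(".")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"not a four-level name: {entry!r}")
    return tuple(parts)


def missing_entries(definition, available):
    """Return the definition entries absent from `available`, in definition order."""
    return [entry for entry in definition if entry not in available]


definition = [
    "common.setup.timestamp.class",
    "montura.app1.xml.class",
    "montura.app1.report.class",
]

# Assumption for the demo: the report class was never saved to its catalog.
available = {"common.setup.timestamp.class", "montura.app1.xml.class"}

# Reading the list top to bottom gives the full execution order.
for position, entry in enumerate(definition, start=1):
    print(position, parse_entry(entry))

print("missing:", missing_entries(definition, available))
```

Because every program is named in one flat list, the absence check is a single pass over that list. With nested includes, the same check would require opening each file to discover what it pulls in.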

Application Instantiator

All object applications use a system virtual machine that translates object code into executable code. The virtual machine in SAS is part of the SAS Foundation, so unlike Java - no download required.

init: 
    dcl object appID=_new_ controller.class();
    appID.runInterface(); 
    appID._term(); 
return; 

The first line invokes the VM within SAS Foundation, which loads an object application named "controller" from the same SAS catalog into a memory-resident state for execution. Optionally, you may use a four-level qualifier for an application in a different catalog.

The second line begins execution using the standard method name "runInterface".

The third line shows how object applications are terminated and removed from memory.
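
The three-line lifecycle described above (instantiate, run, terminate) can be modeled as follows. This is a sketch in Python, not SCL, under the assumption that the controller exposes exactly those three operations. The class `Controller` and its `events` list are hypothetical stand-ins used only to make the order of calls visible.

```python
class Controller:
    """Hypothetical stand-in for controller.class; records each lifecycle event."""

    def __init__(self):
        # Corresponds to _new_: the object is loaded into memory.
        self.events = ["loaded"]

    def run_interface(self):
        # Corresponds to runInterface(): the standard entry point.
        self.events.append("run")

    def term(self):
        # Corresponds to _term(): the object is terminated and released.
        self.events.append("terminated")


def init():
    """Mirror the three lines of the instantiator, in the same order."""
    app = Controller()
    app.run_interface()
    app.term()
    return app.events


print(init())
```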


Controller

The main driver for the application can be written in as few as two steps. Functionality is so boilerplate that a single controller can be used by almost every SAS object application that you write.

  • Instantiate each object in each repository.
  • Execute each instance.

Object Instantiation is simple - apply the SAS instance and load functions to each SAS object identified in Application Definition.

Instance Execution is also simple. Iterate from 1 to N over each instance and execute using the standard main driver method "runInterface". The following block of code illustrates the code view. Notice there is no room for if-then logic in the controller. This program has single-minded purpose - execute every program in the definition as fast as possible. There is no room here to identify any program by name. There is no way the controller can be programmed with logic that will conditionally execute or skip any program for any reason.

  • Every program has its runInterface method executed as a standard practice.
  • The best way to adjust this process is to add or remove the object name from the application definition, use multiple application definitions for different purposes, or dynamically add/remove application definition programs based on specific criteria before the Instantiation step. Storing object names in a SAS dataset allows SAS programmers the most flexibility when it comes to on-the-fly adjustments.

runInterface: method;
    dcl num xItem;
    dcl object sasObject;

    /* "application" is assumed to hold the instances created in the Instantiation step */
    do xItem=1 to listlen(application);
        sasObject=getitemo(application, xItem);
        sasObject.runInterface();
    end;
endmethod;
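
The two controller steps, and the rule that adjustments belong in the definition rather than in the controller, can be sketched as follows. This is Python rather than SCL so that it runs without SAS. The names `Task`, `instantiate`, `run_all`, and the `skip` set are hypothetical and exist only for this illustration; only the `run_interface` entry point and the entry names are taken from the text.

```python
class Task:
    """Stand-in for a SAS object; it records its own name when executed."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def run_interface(self):
        self.log.append(self.name)


def instantiate(definition, log):
    """Step 1: create one instance per entry, preserving definition order."""
    return [Task(name, log) for name in definition]


def run_all(instances):
    """Step 2: execute every instance. No names and no if-then logic appear here."""
    for instance in instances:
        instance.run_interface()


definition = [
    "common.setup.timestamp.class",
    "montura.app1.xml.class",
    "montura.app1.report.class",
]

# Any adjustment is made to the definition before instantiation.
skip = {"montura.app1.xml.class"}  # hypothetical criterion for this run
adjusted = [name for name in definition if name not in skip]

log = []
run_all(instantiate(adjusted, log))
print(log)
```

The loop in `run_all` never inspects a program name, so the same controller can serve any application whose definition is built this way.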

Task Definition

The first step in writing a SAS object is to standardize the way each object self-determines its own execute status, where self-determination is based on a combination of local and global flags. The following code illustrates how to code a single validation on a SAS dataset.

runInterface

  • Check the global error flag; if there is a variable named "stop", do not execute any other method in this program.
  • Check the local error flag after each METHOD is executed; if there is a variable named "stop", do not execute any more methods in this program.

interface1

  • This validation is only to be executed if a specific SAS dataset is found.
  • Set local error flag on the "not found" condition.

interface2

  • Use SQL with certain criteria to create a SAS dataset.
  • If any SQL error is detected, set the global error flag to stop the application.

interface3

  • If the validation dataset contains any data, format a readable error message for the QA team or end user.

interface4

  • Remove the validation dataset on the zero-data condition to avoid clogging up the WORK library. A single application may contain dozens of validations, and empty datasets are a distraction during the debug phase.

 

class validate1;
    /* Global shared properties */
    public list gControl / (sendEvent='N');
    public list gError / (sendEvent='N');
    public list gGlobal / (sendEvent='N');
    public list gData / (sendEvent='N');

    /* Local control flags and the ordered list of methods to execute */
    public list iControl / (sendEvent='N');
    public list interface / (initialValue={
        'interface1',
        'interface2',
        'interface3',
        'interface4'
        });

    /* Run each method in order until a global or local "stop" item appears */
    runInterface: method;
        dcl num xMethod;
        do xMethod=1 to listlen(interface) while (nameditem(gControl, 'stop')=0 and
                                                  nameditem(iControl, 'stop')=0);
            call send(_self_, getitemc(interface, xMethod));
        end;
    endmethod;

    /* Local stop: the optional dataset is not present */
    interface1: method;
        if exist('work.xmldata')=0 then
            setnitemc(iControl, 'Normal stop, optional SAS dataset not found', 'stop');
    endmethod;

    /* Build the validation dataset; any SQL error sets the global stop flag */
    interface2: method;
        dcl num limit_date=getnitemn(gData, 'limit_date');

        control asis;
        submit continue;
        proc sql;
            create table limitDateError as
            select *
            /* assumed to be the same dataset that interface1 checks for */
            from work.xmldata
            where start_date LT &limit_date and
                  end_date LT &limit_date;
        quit;
        endsubmit;

        if symgetn('sqlrc') then do;
            insertc(gControl, description||' '||_method_, -1, 'stop');
            insertc(gControl, 'SQL syntax', -1, 'error');
        end;
    endmethod;

    /* Format one readable error message per offending row */
    interface3: method;
        dcl num dset;
        dcl char cDate1 cDate2;

        dset=open('limitDateError', 'i');
        do while(fetch(dset)=0);
            cDate1=put(getvarn(dset, varnum(dset, 'start_date')), mmddyy10.);
            cDate2=put(getvarn(dset, varnum(dset, 'end_date')), mmddyy10.);

            insertc(gError, 'LIMIT_DATE', -1, 'error_id');
            insertc(gError, 'Historical dates not allowed in forecast', -1, 'message');
            insertc(gError, 'Limit Start:'||cDate1||' Limit End: '||cDate2, -1, 'detail');
        end;
        close(dset);
    endmethod;

    /* Remove the validation dataset when no errors were recorded */
    interface4: method;
        if listlen(gError)=0 then
            delete('limitDateError');
    endmethod;
endclass;
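
The interaction between the local and global stop flags is easier to follow when several objects run back to back. The sketch below models that behavior in Python rather than SCL. The class `Validation`, its constructor arguments, and the dictionaries standing in for gControl and iControl are hypothetical. Only the method names and the stop-flag rule are taken from the skeleton above.

```python
class Validation:
    """Stand-in for a task object; method names follow the skeleton in the text."""

    interface = ["interface1", "interface2", "interface3", "interface4"]

    def __init__(self, name, g_control, dataset_found=True, sql_ok=True):
        self.name = name
        self.g_control = g_control   # shared by every object in the application
        self.i_control = {}          # private to this object
        self.dataset_found = dataset_found
        self.sql_ok = sql_ok
        self.ran = []

    def run_interface(self):
        for method_name in self.interface:
            # Both flags are re-checked before each method, as in the do-while loop.
            if "stop" in self.g_control or "stop" in self.i_control:
                break
            self.ran.append(method_name)
            getattr(self, method_name)()

    def interface1(self):
        if not self.dataset_found:
            self.i_control["stop"] = "Normal stop, optional dataset not found"

    def interface2(self):
        if not self.sql_ok:
            self.g_control["stop"] = f"{self.name} interface2"

    def interface3(self):
        pass  # would format readable error messages

    def interface4(self):
        pass  # would remove an empty validation dataset


g_control = {}
first = Validation("validate1", g_control, dataset_found=False)
second = Validation("validate2", g_control, sql_ok=False)
third = Validation("validate3", g_control)

# The controller still calls every object; each object decides for itself.
for task in (first, second, third):
    task.run_interface()

print(first.ran)   # local stop after interface1
print(second.ran)  # global stop raised in interface2
print(third.ran)   # global stop already set, so nothing runs
```

A local stop ends only the object that raised it, so the second object still runs. A global stop ends the raising object and prevents every later object from executing any method, without the controller needing any conditional logic.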

Global Shared Property

Repository applications allow for constant changes to the application definition. Passing parameters from program #6 to program #7 does not work because there is no way to guarantee that program #7 will be program #7 tomorrow. Program #7 might be deleted from the application, moved to another catalog, renamed, or moved farther down in the definition to position #27.

The old problem here is control over program execution. If the currently executing program is program #6 how do you invoke program #7 and know that program #7 has received the correct number of parameters? How does program #7 know that each parameter is numeric, character, or missing value?

The solution is simple. Create a global data vector where variables are used pretty much the same way we use macro variables today; the only difference is the functions that are used for getting and setting the values. Instead of using data step functions symget() and symput(), we use SCL functions getnitemc() and setnitemc().

The solution is applied in two easy steps.

  • First step is to load the global data vector with one or more parameters. Source code from interface3 (above) shows how the INSERTC function is used to insert three character variables into a global property named GERROR.

    insertc(gError, 'LIMIT_DATE', -1, 'error_id');
    insertc(gError, 'Historical dates not allowed in forecast', -1, 'message');
    insertc(gError, 'Limit Start:'||cDate1||' Limit End: '||cDate2, -1, 'detail');

  • The second step is to execute the receiving program. Use the SCL function getnitemc to read the three character variables from the global property GERROR. You simply don't need to pass parameters between programs any more. Just add/remove/change data values in the global property. Think of this as just another way to make SAS programming EASY on yourself.

interface1: method;
    dcl char error_id;
    dcl char message;
    dcl char detail;

    /* Read each parameter only if the named item exists in gError */
    if nameditem(gError, 'error_id') then
        error_id=getnitemc(gError, 'error_id');

    if nameditem(gError, 'message') then
        message=getnitemc(gError, 'message');

    if nameditem(gError, 'detail') then
        detail=getnitemc(gError, 'detail');

    submit continue;
    proc sql;
        insert into errorLog (
            error_id,
            message,
            detail
        )
        values (
            "&error_id",
            "&message",
            "&detail"
        );
    quit;
    endsubmit;

    if symgetn('sqlrc') then do;
        insertc(gControl, description||' '||_method_, -1, 'stop');
        put 'SQL ERROR occurred in program:' description ' Step:' _method_;
    end;

    /* Empty the shared property so stale parameters are not reused */
    clearlist(gError);
endmethod;
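
The load-then-read pattern can be modeled without SAS. The sketch below uses Python rather than SCL, with a plain dictionary standing in for the GERROR property. The function names `producer` and `consumer`, the `error_log` list, and the sample dates are hypothetical. Note one simplification: a dictionary holds a single value per name, whereas an SCL list can hold several items with the same name.

```python
def producer(g_error, start, end):
    """Load three named items, mirroring the three insertc calls."""
    g_error["error_id"] = "LIMIT_DATE"
    g_error["message"] = "Historical dates not allowed in forecast"
    g_error["detail"] = f"Limit Start:{start} Limit End: {end}"


def consumer(g_error, error_log):
    """Read items by name, tolerate absent ones, then clear the shared property."""
    row = {name: g_error.get(name, "") for name in ("error_id", "message", "detail")}
    error_log.append(row)
    g_error.clear()
    return row


g_error = {}      # stand-in for the GERROR global property
error_log = []    # stand-in for the errorLog table

producer(g_error, "01/01/2010", "01/31/2010")
row = consumer(g_error, error_log)

print(row["error_id"], "|", row["detail"])
print("items left in g_error:", len(g_error))
```

Neither function names the other or depends on its position in the application definition. The receiving side looks up each value by name and copes with a missing one, which is what allows programs to be reordered or removed freely.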

COPYRIGHT © 1989 - 2011 Montura, Inc.
All rights reserved. This material may not be published, broadcast, rewritten or redistributed.
All material on this website is drawn directly from US Patent 7,984,422, Repository Relationship Programming.
Reading any page on this website is the same as reading patent US 7,984,422.
Call 510-798-8367 to obtain your license for use today.
Violators will be prosecuted and perhaps persecuted with undesirable press release news as well.
