Tuesday, 12 May 2015

Direct Update DSO

Scenario for Using DataStore Objects for Direct Update 

The following graphic shows a typical operational scenario for DataStore Objects for direct update:
DataStore objects for direct update ensure that the data is available quickly. The data from this kind of DataStore object is accessed transactionally. The data is written to the DataStore object (possibly by several users at the same time) and reread as soon as possible.
It is not a replacement for the standard DataStore object. It is an additional function that can be used in special application contexts.

The DataStore object for direct update consists of a table for active data only. It retrieves its data from external systems via fill or delete APIs. See DataStore Data and External Applications.


A Standard DSO differs from a Direct Update DSO in how data is processed.
In a Standard DSO, data is stored in different versions (active, delta, and modified),
whereas a Direct Update DSO contains data in a single version: data is stored in precisely the same form in which it was written to the DSO by the application. It consists of an active data table only and retrieves its data from external systems via fill or delete APIs.

 The following APIs exist (a usage sketch follows the list):

  • RSDRI_ODSO_INSERT: Inserts new data records that do not yet exist in the system.
  • RSDRI_ODSO_INSERT_RFC: Same as RSDRI_ODSO_INSERT, but can be called remotely.
  • RSDRI_ODSO_MODIFY: Modifies existing records; if a record does not yet exist, it is inserted.
  • RSDRI_ODSO_MODIFY_RFC: Same as RSDRI_ODSO_MODIFY, but can be called remotely.
  • RSDRI_ODSO_UPDATE: Modifies existing records only.
  • RSDRI_ODSO_UPDATE_RFC: Same as RSDRI_ODSO_UPDATE, but can be called remotely.
  • RSDRI_ODSO_DELETE_RFC: Deletes records.
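Below is a minimal ABAP sketch of how an application could prepare records for one of these fill APIs. The DSO name ZDSO_DU, its fields, and the generated active-table name /BIC/AZDSO_DU00 are hypothetical, and the exact parameter interface of RSDRI_ODSO_INSERT is deliberately not spelled out here; take it from the function module definition in transaction SE37.

  " Hedged sketch: filling a direct update DSO (hypothetical name ZDSO_DU).
  " The data table must match the layout of the DSO's active data table,
  " which by the usual naming convention is /BIC/AZDSO_DU00.
  DATA: lt_data TYPE STANDARD TABLE OF /bic/azdso_du00,
        ls_data LIKE LINE OF lt_data.

  ls_data-/bic/zcustid = '0000012345'.   " hypothetical key field
  ls_data-/bic/zamount = '150.00'.       " hypothetical data field
  APPEND ls_data TO lt_data.

  " Hand the table to the fill API. The parameter names are not reproduced
  " here because they should be checked in SE37; conceptually the call
  " passes the DSO name 'ZDSO_DU' and the data table lt_data and returns
  " messages/error information.
  " CALL FUNCTION 'RSDRI_ODSO_INSERT'
  "   EXPORTING ...   (DSO name, data table)
  "   IMPORTING ... . (messages, error flag)

Several users can write to the DSO in parallel this way, which is what makes the direct update DSO suitable for transactional access, for example as the data target of an analysis process.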

Direct Update DSO – Properties

  • A DSO for direct update contains data in a single version. Therefore, data is stored in precisely the same form in which it was written by the application.
  • Records with the same key are not aggregated
  • SIDs cannot be generated
  • Data from it can be updated to additional info providers.
  • The DSO for direct update is available as an Info Provider in BEx Query Designer and can be used for analysis purposes.
  • It is an additional function that can be used in special application contexts.
  • Allows data to be loaded quickly without using the extraction and loading processes of the BI system.
  • Can be used as data target for an analysis process.

Direct Update DSO – Functionality

  • Data in the Direct Update DSO can be updated to additional InfoProviders.
  • Can be used as an InfoProvider in the BEx Query Designer for analysis purposes.
Advantages:
  • It is structured in a way that makes data access easy.
  • Data is made available for analysis and reporting immediately after it is loaded.
Drawbacks:
  • The loading process is not supported by the BI system (DataSources do not provide the data).
  • Therefore these DSOs are not displayed in the administration screens or in the monitor.
  • Since no change log is generated, a delta update to the InfoProviders is not possible.

Creation Of Direct Update DSO:

Step 1)
  1. Go to transaction code RSA1
  2. Click the OK button.
Step 2)
  1. Navigate to Modelling tab->Info Provider.
  2. Right click on Info Area.
  3. Click on “Create Data Store Object” from the context menu.
Step 3)
  1. Enter the Technical Name.
  2. Enter the Description.
  3. Click on the “Create” button.
Step 4)
Click on the Edit button of “Type of DataStore Object”.
Step 5)
Choose the Type “Direct Update”.
Step 6)
Activate the DSO.

Write Optimized DSO

Overview

The Write-Optimized DSO was introduced in BI 7.0. Unlike the Standard DSO, it has only one relational table, the active table, and no SIDs are generated; hence loading data from a DataSource into a Write-Optimized DSO takes less time and uses less disk space.

Business Case

A data store is required for holding data at a detailed level, with immediate reporting or further update capability. No overwrite functionality is required.

Limitation of Standard DSO

  • A standard DSO allows you to store information at a detailed level; however, the activation process is mandatory.
  • Reporting or further update is not possible until activation is completed.
     

Write Optimized DSO - Properties

  • Primarily designed for initial staging of source system data
  • Business rules are only applied when the data is updated to additional InfoProviders.
  • Data is stored at the most granular level
  • Can be used for faster upload
  • Records with the same key are not aggregated, but inserted as new records, as every record has a new technical key
  • Data is available in the active version immediately for further processing
  • There is no change log table and activation queue in it.
  • Data is saved quickly.
  • Data is stored in it at request level, same as in PSA table. 
  • Every record has a new technical key, only inserts.
  • It allows parallel load, which saves time for data loading.
  • It can be included in a Process Chain, and we do not need an activation step for it.
  • It supports archiving.  

Write-Optimized DSO - Semantic Keys

The semantic key is used to identify erroneous or duplicate incoming records.
Semantic keys protect data quality: all subsequent records with the same key are written to the error stack along with the incorrect data records.
To process the erroneous or duplicate records, a semantic group is defined in the DTP.
Note: if we are sure there are no incoming duplicate or erroneous records, semantic groups need not be defined.

 

Write Optimized DSO - Data Flow

1. Construct the data flow model.
2. Create the DataSource.
3. Create the transformation.
4. Create the InfoPackage.
5. Create the DTP.

 

Write-Optimized DSO - Settings

If we do not check the check box "Do not Check Uniqueness of Data", the data coming from the source is checked for duplication; i.e. if a record with the same semantic key already exists in the DSO, the current load is terminated.
If we select the check box, duplicate records are loaded as new records; the semantic keys have no relevance in this case.


When is a Write-Optimized DSO Recommended?

  • For faster data loads, DSOs can be configured to be write-optimized.
  • When access to the source system is available only for a short duration.
  • It can be used as a first staging layer.
  • In cases where delta is not enabled in the DataSource, we first load data into a Write-Optimized DSO and then a delta load can be done to the Standard DSO.
  • When we need to load a large volume of data into InfoProviders, the Write-Optimized DSO helps in executing complex transformations.
  • A Write-Optimized DSO can be used to fetch history at request level, instead of going to the PSA archive.

Functionality

  • It contains only one table, i.e. the active data table (DSO key: Request ID, Data Package, and Record Number)
  • It does not have any change log table or activation queue.
  • Every record in Write Optimized DSO has a new technical key, and delta in it works record wise.
  • In Write Optimized DSO data is stored at request level like in PSA table.
  • In Write Optimized DSO SID is not generated.
  • Reporting on a Write-Optimized DSO is possible, but it is not good practice as it will affect the performance of the DSO.
  • In Write Optimized DSO BEx Reporting is switched off.
  • Write Optimized DSO can be included in InfoSet or Multiprovider.
  • Data load performance is better with a Write-Optimized DSO because there is no activation step involved; the system generates a unique technical key instead.
  • The technical key in a Write-Optimized DSO consists of the Request GUID field (0REQUEST), the Data Package field (0DATAPAKID) and the Data Record Number field (0RECORD); see the sketch after this list.
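The sketch below reads this technical key from the generated active data table. The DSO name ZDSO_WO, its active table /BIC/AZDSO_WO00, and the column names REQUEST, DATAPAKID and RECORD (the usual generated names for 0REQUEST, 0DATAPAKID and 0RECORD) are assumptions for illustration; verify the actual table and field names in SE11.

  " Hedged sketch: inspecting the technical key of a write-optimized DSO
  " (hypothetical name ZDSO_WO, active data table /BIC/AZDSO_WO00).
  DATA lt_rows TYPE STANDARD TABLE OF /bic/azdso_wo00.

  SELECT request datapakid record          " 0REQUEST, 0DATAPAKID, 0RECORD
    FROM /bic/azdso_wo00
    INTO CORRESPONDING FIELDS OF TABLE lt_rows
    UP TO 10 ROWS.

  " Every loaded record carries a new, unique combination of these three
  " fields, which is why records with the same semantic key are never
  " aggregated but always inserted as new rows.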

 

Points to Remember

  • Generally a Write-Optimized DSO is not preferred for reporting, but if we want to use it for reporting, it is recommended to define a semantic key in order to ensure the uniqueness of the data.
  • Write-optimized DSOs can force a check of the semantic key for uniqueness when data is stored.
  • If this option is active and duplicate records are loaded with regard to the semantic key, these are logged in the error stack of the Data Transfer Process (DTP) for further evaluation.
  • If we need to use the error stack in our flow, we need to define the semantic key at the DSO level.
  • Semantic group definition is necessary to do parallel loads.

Reporting

If we want to use write-optimized DataStore object in BEx queries (not preferred), it is recommended to:
1. have a semantic key and
2. ensure that the data is unique.
The technical key is not visible for reporting, so the object looks like any regular DSO.

Use

Data that is loaded into write-optimized DataStore objects is available immediately for further processing.
They can be used in the following scenarios:

  You use a write-optimized DataStore object as a temporary storage area for large sets of data if you are executing complex transformations for this data before it is written to the DataStore object. The data can then be updated to further (smaller) InfoProviders. You only have to create the complex transformations once for all data.

  You use write-optimized DataStore objects as the EDW layer for saving data. Business rules are only applied when the data is updated to additional InfoProviders.
The system does not generate SIDs for write-optimized DataStore objects and you do not need to activate them. This means that you can save and further process data quickly. Reporting is possible on the basis of these DataStore objects. However, we recommend that you use them as a consolidation layer, and update the data to additional InfoProviders, standard DataStore objects, or InfoCubes.

Structure

Since the write-optimized DataStore object only consists of the table of active data, you do not have to activate the data, as is necessary with the standard DataStore object. This means that you can process data more quickly.
The loaded data is not aggregated; the history of the data is retained. If two data records with the same logical key are extracted from the source, both records are saved in the DataStore object. The record mode responsible for aggregation remains, however, so that the aggregation of data can take place later in standard DataStore objects.
The system generates a unique technical key for the write-optimized DataStore object. The standard key fields are not necessary with this type of DataStore object. If there are standard key fields anyway, they are called semantic keys so that they can be distinguished from the technical keys. The technical key consists of the Request GUID field (0REQUEST), the Data Package field (0DATAPAKID) and the Data Record Number field (0RECORD). Only new data records are loaded to this key.
You can specify that you do not want to run a check to ensure that the data is unique. If you do not check the uniqueness of the data, the DataStore object table may contain several records with the same key. If you do not set this indicator, and you do check the uniqueness of the data, the system generates a unique index in the semantic key of the InfoObject. This index has the technical name "KEY". Since write-optimized DataStore objects do not have a change log, the system does not create delta (in the sense of a before image and an after image). When you update data into the connected InfoProviders, the system only updates the requests that have not yet been posted.
Use in BEx Queries
For performance reasons, SID values are not created for the characteristics that are loaded. The data is still available for BEx queries. However, in comparison to standard DataStore objects, you can expect slightly worse performance because the SID values have to be created during reporting.
If you want to use write-optimized DataStore objects in BEx queries, we recommend that they have a semantic key and that you run a check to ensure that the data is unique. In this case, the write-optimized DataStore object behaves like a standard DataStore object. If the DataStore object does not have these properties, you may experience unexpected results when the data is aggregated in the query.

 ********************************************************************************

A Write-Optimized DSO is used when a data store is required for records at the lowest granularity, such as addresses, and when overwrite functionality is not needed. It consists of the active data table only, so no data activation is needed, which speeds up data processing. The loaded data is available immediately for further processing, and the object can be used as a temporary storage area for large sets of data.
The Write-Optimized DSO has been primarily designed as the initial staging layer for source system data, from which the data can be transferred to a Standard DSO or an InfoCube.
 
  1. The PSA receives data unchanged from the source system.
  2. Data is posted at document level to a pass-through write-optimized DSO; after it is loaded into the standard DSOs, it is deleted there.
  3. Data is posted from the pass-through write-optimized DSO to a corporate-memory write-optimized DSO.
  4. Data is distributed from the write-optimized "pass-through" DSO to Standard DSOs as per business requirements.
Write Optimized DSO Properties:
  • It is used for initial staging of source system data.
  • Data stored is of lowest granularity.
  • Data loads can be faster since it does not have the separate activation step.
  • Every record has a technical key and hence aggregation of records is not possible. New records are inserted every time.
Creation Of Write-Optimized DSO:
Step 1)
  1. Go to transaction code RSA1
  2. Click the OK button.
Step 2)
  1. Navigate to Modelling tab->Info Provider.
  2. Right click on Info Area.
  3. Click on “Create Data Store Object” from the context menu.
Step 3)
  1. Enter the Technical Name.
  2. Enter the Description.
  3. Click on the “Create” button.
Step 4)
Click on the Edit button of “Type of DataStore Object”.
Step 5)
Choose the Type “Write-Optimized”.
The technical key consists of Request ID, Data Package and Record Number. No additional objects can be included in it.
Semantic keys are similar to key fields; however, their uniqueness is not used for overwrite functionality. They are instead used in conjunction with the setting "Do not check uniqueness of data".
The purpose of the semantic key is to identify erroneous or duplicate incoming records.
Duplicate records are written to the error stack. These records in the error stack can be handled or re-loaded by defining a semantic group in the DTP.
Semantic groups need not be defined if duplicate or erroneous records are not expected.
If we do not check the check box "Do not check uniqueness of data", the data coming from the source is checked for duplication, i.e. if a record with the same semantic key already exists in the DSO, the current load is terminated.
If we select the check box, duplicate records are loaded as new records; the semantic keys have no relevance in this case.
Step 6)
Activate the DSO.

Tuesday, 28 April 2015

DSO - Step by Step : Creation, Extraction, Transformation

A DataStore Object (DSO) is used to store data in its most granular form, i.e. at document level. This data can be used for reporting as well as for passing on to other data targets like InfoCubes. The steps described in this document apply to the most commonly used type, the Standard DSO. (These steps also apply to Write-Optimized DSOs, except for the data activation step.)


This document is divided into 2 parts due to the length of the content.
  • Part 1 - Creation, Extraction and Transformation
  • Part 2 - Loading, Activation and Maintenance.


1. Creation & Configuration
  • Load Data Warehouse Workbench (DWW) using transaction code RSA1
    1.2.png


  • Under Modeling section select InfoProvider. This will list all the InfoProviders in the right side of the screen grouped by InfoAreas.
    1.3.png

  • Right-Click the InfoArea where the DSO needs to be created and select Create DataStore Object option
    1.5.png


  • In the pop-up window, provide a Name (not more than 8 chars) and Description. You can see the ObjectType set to DataStore Object. (You have the option to change this to other InfoProvider type. Be careful not to change the type). If you need to use a structure similar to an existing DSO, you can enter the Technical Name of that DSO in the Template field. After filling in the required field click the Create icon.
    1.6.png


  • In the Edit DataStore Object screen, under Settings section you can specify various configuration settings. Click on the Edit Button (edit button.png)  to open a pop-up window where you can specify DSO type and Create SIDs option. Check the Set Quality Status to 'OK' option as this is required to activate the DSO data and make it available for reporting and loading it to other data targets.
    1.7.png


  • Under Key fields add the fields that uniquely identify a record, and under Data fields add the fields that carry information for this DSO. You can do so by right-clicking Key fields/Data fields and select InfoObject Direct Input option as shown below.
    1.8.a.1.png


  • In the Insert InfoObjects pop-up, type in the required list of InfoObjects for both Key fields and Data fields.
    1.8.a.2.png1.8.a.3.png

  • Once all the required fields are added, Save(Ctrl+S), Check(Ctrl+F2 or check button.png) and Activate(Ctrl+F3 or activate button.png) the DSO. Once successfully activated, the status changes to Saved and Active and the final screen will be as shown below. The DSO is now ready for data to be loaded into it.
    1.9.png



2. Extraction

2.1 Create a DataSource - The DataSource is the staging area where a copy of the data from the source system (flat file/SAP ERP/other DB) is stored in a flat table. It is also referred to as the persistent staging area, or PSA for short. Below are the steps to create and configure a DataSource.

(To keep it simple and focus on ETL process, a .csv file is used as data source. Normally SAP ERP System will be the data source.)


  • In DWW > Modelling > DataSources, select the Source System type to a Flat file system as shown below.
    2.1.1.png


  • List of DataSources will be displayed in the right pane grouped under Application Components. Right Click the Application Component where the DataSource needs to be created and select Create Data Source... option
    2.1.2.png


  • In the Create DataSource pop-up, provide the DataSource name, Source System name and the type of Data. Since the data to be loaded is transaction data we select Transaction Data as DataSource Data Type. After filling in the required details, click the Create button (ok button.png).
    2.1.3.png

  • Change DataSource screen opens up. In the General Info. tab fill in the short, medium and long description for the DataSource. Leave other data unchanged.
    2.1.4.png


  • In the Extraction tab, specify the .csv file path from which the data is to be extracted. Also specify other details like Header Rows to be Ignored and Data Format accordingly as shown below. Leave the Delta Process options with the default value. Since the source system is a .csv file, delta load is not possible, only full upload.
    2.1.5.png


  • In Proposal tab, click the Load Example Data button. This will read/parse the .csv file and will propose the column name and corresponding length and data type for each column. This is just a proposal by the BW system which may not be 100% correct.
    2.1.6.png


  • The Fields tab is where the structure of the PSA/DataSource is finalized. The proposed metadata appears here and it can be edited to suit the needs. You can enter the InfoObject name in the InfoObjects column to copy the defaults from that InfoObject's metadata definition. This will ensure the correct technical specifications for the fields proposed.
    2.1.8.png

  • The Preview tab allows you to see the data in PSA/DataSource format before actually loading the data into BW system. Click on the Read Preview Data button to preview the data.
    2.1.9.png



2.2 Create an InfoPackage - An InfoPackage is a scheduler object in the BW system that extracts data from the source system and saves a copy into the PSA/DataSource. Below are the steps to create, configure and execute an InfoPackage.

  • In DWW > Modeling > DataSources, locate the DataSource you have created. Right click the DataSource and select Create infoPackage... option
    2.2.1.png


  • In the Create InfoPackage pop-up, provide the InfoPackage Description.
    2.2.2.png


  • Maintain InfoPackage screen opens up. In the Data Selection tab, you'll see the list of fields along with filtering criteria used while extracting data from source system. Since we haven't configured any field in DataSource for filtering, the below screen lists none.
    2.2.3.png


  • Extraction tab has the data auto-populated from the Extraction tab in DataSource definition. This lists the source file name, file type and format. The auto-populated data can be edited if required.
    2.2.4.png
  • The Processing tab specifies how the extraction is processed by the BW system and where it is stored. As shown below, select Only PSA. This enables you to edit the data in the PSA before it is loaded into a data target.
    2.2.5.png


  • Update tab has options to specify whether the update is a delta update or a full update. Since a flat file is used for this illustration, it is not delta capable and Full Update is the only listed option as shown below.
    2.2.6.png


  • Schedule tab allows you to specify when to start the execution of extraction, either immediately or at a scheduled time. To follow this Illustration, choose Start Data load Immediately option and click the Start button. This will now extract data from .csv file and load it into the DataSource/PSA table.
    2.2.7.png
  • Now click on the Monitor icon (monitor button.png or F6) to check the status of data extraction. This opens up the Monitor InfoPackage screen which lists a variety of information related to the extraction like status, no. of records, time taken etc. The Details tab shows the extraction details broken into various steps as shown below. You can also click on PSA Maintenance icon (PSA button.png or Ctrl+F8) to view/edit the actual data loaded into PSA.
    2.2.9.png



3. Transformation

3.1 Create a Transformation - Transformation object specifies the rules on how the source object fields should map to data target fields. There are multiple transformation options. For this illustration Direct Assignment is used. Below are the steps to create a transformation.


  • In DWW > Modeling > InfoProvider, locate & right click the DSO and select Create Transformation... option
    2.3.1.png
  • In the Create Transformation pop-up, specify the source and the target for the transformation. Source being the DataSource we just created and the target being the DSO. Then click the Continue button (ok button.png).
    2.3.2.png


  • Change Transformation screen opens up. On the left is the source object and on the right is the target object for the transformation. BW system automatically matches the InfoObject fields. For those unmapped, click & drag the field from source and drop it on appropriate field at the target. The default transformation rule is Direct Assignment. This can be changed to required type by right clicking on the line and providing alternate options in the pop-up that opens.
    2.3.3.png
  • Now Save, Check(check button.png or Ctrl+F2) and Activate(activate button.png or Ctrl+F3) the transformation.


This completes the Creation, Extraction and Transformation steps for a DSO.

4. Loading

4.1 Create a Data Transfer Process(DTP) - DTP controls the movement of data within BW system. It can read data from PSA or a data target, apply the transformation rules and load it to other data targets. Below are the steps to create, configure and execute a DTP.


  • In DWW > Modeling > InfoProvider, locate and expand the DSO tree. Right click the Data Transfer Process folder and select Create Data Transfer Process... option
     2.4.1.png


  • In the Create Data transfer Process pop-up, Specify the DataSource as source of the DTP and click the OK (ok button.png) button.
     2.4.2.png

  • Change Data Transfer Process screen opens up. In the Extraction tab, provide details about the DataSource for the DTP. For Extraction Mode, specify Delta as this ensures that system loads only the requests that are yet to be loaded into data target. Package Size determines the number of records to be extracted in a set.
     2.4.3.png

  • In the Update tab, specify the error handling settings. Select the Update Valid Records, No Reporting (Request Red) option. This ensures that erroneous records get written to the Error Stack and only the valid records get loaded into the data target, but the entire request remains unavailable for reporting.
     2.4.4.png
  • Now Save, Check(check button.png or Ctrl+F2) and Activate(activate button.png or Ctrl+F3) the DTP.


  • In the Execute tab, there are options to specify Processing Mode. Select Serial Extraction, Immediate Parallel Processing option and click the Execute button. Data gets processed asynchronously in a background process.
     2.4.5.png


  • Click on the Monitor Icon (monitor button.png) to open the Monitor: Data Transfer Process page. This will list the status, duration and break down of all the request processing steps.
     2.4.6.png


  • With this the data is loaded into the DSO. But the loaded data is stored in the Activation Queue and is not available for reporting.



5. Activation


  • In DWW > Modeling > InfoProvider, locate and right click the DSO. Select Manage option from the context menu.
     2.5.1.png


  • Manage InfoProvider screen opens up. In the Requests tab, click the Activate button at the bottom of the screen.
     2.5.2.png


  • A pop-up window opens and lists all the requests that are loaded but not yet activated in the DSO. Select the request that needs to be activated and click the Start button at the bottom of the pop-up.
     2.5.3.png
  • This will submit a data activation request that will be processed in background. To check the status of activation, go back and see the request row in Requests tab.
  • Once the data gets activated, i.e. once it is moved from the Activation Queue to the Active Data table within the DSO, it becomes available for reporting and for loading into other data targets, usually InfoCubes (see the table-naming sketch below).
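As a quick orientation on where the data physically sits before and after activation, the snippet below counts the records in the two generated tables of a hypothetical standard DSO ZSALES_O1. By the usual naming convention the activation queue is /BIC/AZSALES_O140 and the active data table is /BIC/AZSALES_O100; the change log is a separate generated table whose name can be seen in the Manage screen's Contents tab. All names here are assumptions for illustration.

  " Hedged sketch: record counts in the generated tables of a standard DSO
  " (hypothetical name ZSALES_O1).
  DATA: lv_queue  TYPE i,
        lv_active TYPE i.

  SELECT COUNT( * ) FROM /bic/azsales_o140 INTO lv_queue.   " activation queue
  SELECT COUNT( * ) FROM /bic/azsales_o100 INTO lv_active.  " active data table

  WRITE: / 'Records waiting for activation:', lv_queue,
         / 'Activated records (reportable):', lv_active.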



6. Maintenance

6.1 Deleting Change Log data for Performance Improvement - As the DSO ages, the data in Change Log table piles up creating performance issues. So it is a good practice to free it up periodically. Below are the steps to delete data from Change Log table.

  • Go to Manage InfoProvider Screen for the DSO
  • In SAP GUI's menu bar, open Environment Menu and select Delete Change Log Data option
     6.1.png

  • This opens Delete Change Log Data screen where you can select what data should be deleted based on how old the request is or when it was loaded. Enter the date and click the Start button to delete data before that date.
     6.2.png

6.2 Request Automation
  • Go to Manage InfoProvider Screen for the DSO
  • In SAP GUI's menu bar, open Environment Menu and select Automatic Request Processing option
     6.6.png
  • In the pop-up window that opens, you can enable/disable automatic setting status to OK and automatically activating requests. Click on the Save button after making the changes.
     6.7.png


6.3 View DSO contents - At times there is a need to view the contents of the DSO, be it the activation queue table, the active data table or the change log, for administration and reconciliation purposes. The steps below show how to view the DSO contents.

  • Go to Manage InfoProvider screen for the DSO
  • In the Contents tab, there are 3 buttons, one each to view the activation queue, active data and change log tables. Click any of the buttons to view its content.
     6.3.png

  • Click New Data button to view the loaded but not yet activated data. This opens the Data Browser screen as shown below. Click on the Number of Entries button to view the record count in the table.
     6.4.png
  • Click on the Execute button (execute button.png). Data Browser screen refreshes with the contents of DSO table as shown below.
     6.5.png


This concludes the 2 Part document illustrating the steps for Creation, Extraction, Transformation, Loading, Activation and Maintenance of a Standard DSO.

All the steps discussed in Part 1 & 2 of this document can be automated using Process Chains.