SAP HANA is rich in features and can process several terabytes of data without breaking a sweat. As data volumes grow exponentially, customers and corporate IT departments are now in various stages of considering a migration to SAP on HANA.
While building the business case for such a migration, we should consider proven best practices that bring down its overall cost. One such best practice is archiving SAP data before the migration. With ever larger volumes of structured and unstructured data, archiving SAP data is a critical but rarely discussed best practice.
There are several reasons why this is a best practice to implement now:
>SAP licenses HANA based on data volume: the more data you have to process, the more expensive it gets.
>Data will keep growing, but not all of it retains the same intrinsic value. Any data that is seldom used should be archived to keep SAP HANA performing well. Therefore, create a plan to archive old, rarely used data that is not critical to ongoing real-time operations. A leaner system always brings down costs at the time of the SAP migration.
SAP HANA stores data on a persistence layer (data and logs). When you start the database, data is loaded into physical memory, but not all of it at once: only tables that have the PRELOAD flag enabled are loaded at this point. It is also possible to enable the PRELOAD flag for just some columns of a table.
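For reference, here is a minimal SQL sketch of how the PRELOAD flag can be set; the schema and table names (SAPSR3, MARA) are only examples and will differ per system:

-- Mark the whole table for preload at database start
ALTER TABLE "SAPSR3"."MARA" PRELOAD ALL;
-- Mark only selected columns for preload
ALTER TABLE "SAPSR3"."MARA" PRELOAD ("MATNR", "MTART");
-- Remove the preload flag again
ALTER TABLE "SAPSR3"."MARA" PRELOAD NONE;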
The SAP HANA database also loads tables and columns into memory when they are required. This can happen because of explicit access, an explicit load, an index load/recreation after optimizing compression, etc. For example, if I execute a SELECT * FROM a table and there is enough memory, that table will be fully loaded into memory if it wasn't already. If it is a huge table (tens or hundreds of GB), out-of-memory (OOM) situations can occur.
Conversely, HANA unloads tables from memory in certain situations: for example, when the SAP HANA database runs low on memory, during MERGE/SHRINK operations, or when the table is unloaded explicitly. A high number of loads/unloads can affect system performance, since the database has to bring the table or columns back into memory before the process using them can access them. You can read more about LOAD/UNLOAD operations in SAP Note 2127458 – FAQ: SAP HANA Loads and Unloads.
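As a small illustration (again with placeholder schema/table names), loads and unloads can be triggered and inspected from the SQL console:

-- Explicitly load a column table (all columns) into memory
LOAD "SAPSR3"."MSEG" ALL;
-- Explicitly unload it again
UNLOAD "SAPSR3"."MSEG";
-- Check the current load state of a table
SELECT SCHEMA_NAME, TABLE_NAME, LOADED
  FROM M_CS_TABLES
 WHERE TABLE_NAME = 'MSEG';
-- Review recent unloads and their reasons (e.g. LOW MEMORY, EXPLICIT, MERGE)
SELECT UNLOAD_TIME, SCHEMA_NAME, TABLE_NAME, REASON
  FROM M_CS_UNLOADS
 ORDER BY UNLOAD_TIME DESC;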
Remember, in SAP HANA the size of a table usually translates directly into a memory requirement, because the database must be able to load the table into memory. If a table doubles in size on the persistence layer, loading it into physical memory will need roughly double the memory. This is not 100% accurate, but it helps to convey the idea:
Size of a table = Space on the persistent layer = Physical memory = Money
If we don't archive or delete data from our database, sooner or later we will run out of memory. When HANA runs out of memory, OOM situations start to appear and users complain because their processes are being canceled. Also, if we run calculation views on the SAP HANA database (for example, from a BW load chain), the number of rows handled by the calculation views will keep increasing. This means the calculation views will take more memory and more time to finish.
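To see which statements (for example, calculation view queries) consume the most memory, one option is the M_EXPENSIVE_STATEMENTS monitoring view; this sketch assumes that expensive statement tracing and memory tracking are enabled on the system:

-- Top memory consumers among traced statements
-- (requires the expensive statement trace and memory tracking to be active)
SELECT TOP 10 START_TIME, APP_USER, MEMORY_SIZE, STATEMENT_STRING
  FROM M_EXPENSIVE_STATEMENTS
 ORDER BY MEMORY_SIZE DESC;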
An enormous amount of data growth in SAP systems degrades system performance and user productivity, drives up costs through larger, highly redundant storage volumes, and increases administration costs through long backup and maintenance windows. Furthermore, larger databases mean longer system downtimes during upgrades and restores.
Data archiving, in general, means moving large volumes of data that are no longer required in the database to a file system or a third-party storage system. It is functionality provided within SAP applications for the consistent removal of data objects from the tables of the SAP database: all table entries that characterize a data object are written to an archive file outside the database.
Data archiving only "frees up" space in the database during the archive delete job. To actually return that space at the database level, additional reorganization is usually needed:
> Index reorganization
> Tablespace reorganization
"Archiving" refers to the process of copying data and supporting documents from an active system (SAP) to an external source (e.g., OpenText) for the purpose of deletion and/or storage for later retrieval.
In this blog post I will cover only "Classical Data Archiving":
Note: The logical file name must already have been maintained with the transaction FILE (Definition of
logical file paths).
Delete Jobs
> Do Not Schedule
The delete jobs are not started automatically.
Sequence
The point in time at which the files are saved in the content repository is determined by the sequence in which the archive files are processed after their creation:
Delete Phase before Saving
The archive files are saved in the content repository after the delete program has processed the file
in production mode. If the delete program is in test mode, there is no automatic save afterward.
Save before Delete Phase
The archive files are saved in the content repository after the write program has created an archive
file before the delete program starts. The delete program can only process the file after it has been
saved. The Delete Program Reads from Storage System flag controls whether the delete program
reads the data to be deleted from the archive file in the storage system, or in the file system.
Note: If Start Automatically is selected in the delete program settings, the delete program is called after
saving the file. It makes no difference whether the delete program is in test or production mode.
Analogously, the delete jobs are scheduled and started automatically after the event, if After Event is
selected.
Save your entries and go back to the initial screen of the transaction AOBJ.
Important preparatory steps
Perform a DVM (Data Volume Management) analysis on Solution Manager 7.2 for the
SAP HANA database and the SAP system using that database. Basically, the Solution Manager
connects to the database, checks the biggest tables, and provides recommendations we can
follow to reduce their size. The DVM configuration is straightforward: just follow the wizard in
the Solution Manager.
Check the biggest tables in the SAP HANA database and find notes and procedures to
archive or delete their data. We can do this in different ways (a SQL-console alternative is sketched after the screenshots below):
o Using transaction DB02 and the option System Information – Large Tables. Order by
Estimated max Memory Size in Total and Top Space Consumers.
o Using the SAP HANA Mini Checks in the SAP HANA Studio.
o Using transaction TAANA (Table Analysis: Administration), which helps
identify the distribution of data within a table. TAANA can also identify the volume of
archivable data and any archive file routing requirements.
o Using transaction DB15 to identify archiving objects. Given a database table it
displays the archiving objects that cover it, and given an archiving object it displays
the list of associated database tables.
o Also, analyze performance with transaction ST03N for possible archiving objects. E.g.,
the material documents table MSEG can be very large, and archiving MM_MATBEL
can improve runtime for MB51.
Tx. DB02:
Tx. DB15:
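The same large-table check can also be run directly in the SQL console. A minimal sketch (the SAPSR3 schema and the MSEG/MJAHR names are examples for a typical ERP system):

-- Largest column-store tables by estimated maximum memory footprint (GB)
SELECT TOP 20 SCHEMA_NAME, TABLE_NAME,
       ROUND(ESTIMATED_MAX_MEMORY_SIZE_IN_TOTAL / 1024 / 1024 / 1024, 2) AS EST_MAX_GB,
       RECORD_COUNT
  FROM M_CS_TABLES
 ORDER BY ESTIMATED_MAX_MEMORY_SIZE_IN_TOTAL DESC;

-- TAANA-style distribution: material documents per fiscal year,
-- to estimate how much data is old enough to archive
SELECT MJAHR, COUNT(*) AS DOCUMENTS
  FROM "SAPSR3"."MSEG"
 GROUP BY MJAHR
 ORDER BY MJAHR;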
Configuration of Archive Objects
To create custom archiving objects, go to Tx. AOBJ and click 'New Entries'.
1. Structure Definition
The Structure definition contains the list of the database tables from which the data will be archived. This
is pre-configured for the Standard SAP Archive Objects.
2. Tables from Which You Only Delete Entries
This contains the list of database tables from which the data will only be deleted, not archived. This is
pre-configured for the standard SAP archiving objects.
3. Preceding Archiving Objects
This contains the list of prerequisite archiving objects that must be archived before the actual archiving
object. This is pre-configured for the standard SAP archiving objects.
4. Customizing Settings
This should be configured as per business requirements. It contains details about the archive file size,
settings for the delete program such as the test mode variant and the production mode variant, the
content repository, and the sequence of deletion. If the radio button 'Start Automatically' for
delete jobs is selected, the delete program is executed automatically after the write program. If not
selected, execute the delete program manually for the particular archiving object.
5. Archiving Classes Used
This contains the details about the archiving classes used by the particular archiving object.
6. Read Programs
This contains the details about the read programs, which are used by both the archiving object and the
archiving classes to read data from the archive files.
7. Customizing Transaction
This contains the transaction code for the application-specific Customizing of the relevant archiving
object. Once this code is entered, you can go directly from transaction SARA to the application-specific
Customizing transaction, which is often used for entering residence times for an archiving object.
8. Exits
This contains details about the customizing code (user exits / Business Add-Ins) in the archive and delete
programs.
Archive Programs
The following programs in the SAP system are assigned to the archiving objects.
1. Preprocessing (Optional)
This program prepares the data for archiving, for example by setting the deletion indicator to 'X', but it
does not delete any data from the database. It operates only on the database.
2. Write
It creates new archive files and writes the data to them. At this point, no data has been deleted from the
database. The write program can be executed in two processing modes:
Test mode
Production mode
In test mode, no archive files are created, whereas in production mode archive files are created.
3. Delete
This program reads the data from the archive files and deletes the corresponding data from the database.
The delete program can be executed in two processing modes:
Test Mode
Production Mode
In test mode, the log after execution shows the entries that would be deleted from the database,
whereas in production mode it shows the statistics of the data actually deleted.
4. Postprocessing (Optional)
This program also operates on the database and does not require any archive files. It can be
executed asynchronously with the delete program.
5. Reload (Optional)
This program reloads archived data from the external storage system back into the respective
SAP database tables. It is not available for all archiving objects.
6. Index
This program builds or deletes an index that allows individual access to archived data. The infostructure
created for an archiving object (via SARI) acts as the index to the archived data.
Check Archivability
Checking the archivability of the business objects precedes the actual archiving process. This ensures
that data is not archived while another application still needs it. A business object is generally considered
archivable if it is business-complete and has exceeded its residence time:
Residence Time: the amount of time data must remain in the online database before it meets the
archivability criteria. For example, with a residence time of 24 months, a completed purchasing document
can only be archived two years after its completion.
Retention Time: the entire time the data spends in archive format before it is finally destroyed.
Setting up connection between the ERP System and the External Storage
System
1. Maintain the Content Repository (OAC0)
The content repository is maintained for every archiving object. Below is the content repository maintained
for MM_EKKO.
2. Maintain the Document Type (OAC2)
The document type is maintained via Tx. OAC2. Every archiving object is associated with a document
type, which is in turn linked to a document class. The document class identifies the archive format of the
document in the content server.
Archived data is stored in archive files in the file system or in third-party storage (such as PBS software).
When an archive file is created by the write program, its physical file name is derived from a logical
file name. In the same way, the physical path name is derived from a logical path name. The logical file
name and the logical path must be maintained to obtain an appropriate physical file name and path.
First, determine the target directory in which you want to create the archive files of a certain archiving
object. The physical name of the directory is stored in a logical path name.
Suppose you have created a subdirectory called "archiving" for archive files in the global directory. The
physical path name should then be defined as follows:
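As an illustration (the logical path name below is hypothetical; maintain your own in transaction FILE):

Logical path name: ZARCHIVE_GLOBAL_PATH
Physical path:     <P=DIR_GLOBAL>/archiving/<FILENAME>

Here <P=DIR_GLOBAL> resolves to the value of the profile parameter DIR_GLOBAL, and <FILENAME> is replaced at runtime by the generated archive file name. The logical path is then assigned to the logical file name used by archiving (for example, the standard ARCHIVE_DATA_FILE).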
5. Give the meaning for the variant here and click on “save”
11. Repeat the same procedure from steps 2 to 10 for the archiving object MM_EKKO (purchasing
documents), using the write program with the variant 'TEST2' and the same selection criteria as given for
the preprocessing program, and click the 'Execute' button.
ARV_MM_EKKO_WRI*******
The remaining store and delete jobs are performed with the same process as used for the preprocessing
and write programs. Below are the jobs triggered for the remaining programs for the archiving
objects MM_EBAN and MM_MATBEL, all completed successfully.
13. Select the Job and Click on the Spool to view the output.
Users can access the archived data via the Archive Information System (directly with Tx. SARI).
Users can also reach the Archive Information System by clicking the 'Information System' button in
Archive Administration (Tx. SARA).
Creating an Infostructure
Every archive file accessed through the Archive Information System is accessed via an infostructure. Each
infostructure belongs to exactly one archiving object and refers to a field catalog. A field catalog is
the collection of fields suitable for indexing the archive files of the archiving object concerned. All data
related to an infostructure is maintained in database tables.
To create an infostructure, use Tx. SARJ, or click the "Customizing" button in the Archive
Information System: Central Management.
Note: The standard Infostructure SAP_DRB_MM_EKKO is used here.
Activating an Infostructure:
To use an infostructure, you must first activate it; all standard infostructures are delivered already
activated. Only after activation can an infostructure be filled with data from an archive file and
evaluated.
Evaluating an Infostructure:
The data in the archive files can be retrieved with the Archive Explorer via Tx. SARE, or by clicking the
"Archive Explorer" button in the Archive Information System.
Conclusion
This blog gives the detailed process flow ("strategy") and configuration steps ("approach") that together
make up the design part of data archiving.
If the data is archived in accordance with the applicable rules (the archiving project, basic principles, and
country-specific legal regulations), the write program stores the archivable data from the operational
database in archive files. You can display this data using the various read programs.
You can recover the archived data using reload programs. During this process, the reload program reads
the data from the archive files and reinserts it into the operational database. However, this is only
permitted in an emergency (for example, if the Customizing settings are incorrect or there are technical
problems). To avoid misuse of this function and the risk of inconsistencies caused by bringing old data
back into the operational system, you can only reload archived data into the operational database within
five days of successful archiving.