Internal sorts

Procedure rdjexp can sort:

  • Input-Events
    This sort can be configured to match the physical properties of the Input-Events to be processed.
  • Output-Events
    The procedure enables this sort automatically when Output-Event aggregation is configured.

In either case, you must supply the parameters that are common to both types of sort (see Common parameters).

Parameters specific to one type of sort are explained in Sort Input-Events and Sort Output-Events.

Common parameters

You must supply the parameters listed below if you implement either or both of the available types of sort (Input-Events or Output-Events).

Parameter settings in script.ges and sys.dat

Keyword Description/Values
>ScriptConfiguration< Section
Memory_Sort

Amount of memory in KB to be reserved for the sort module
This amount of memory is allocated at the beginning of the sort and freed at the end
By default, 5000 KB of memory is allocated

Length_Average_ISegt

Average length of an Input-Event segment (number of bytes)
If this value is incorrect, the system adjusts it automatically during the sorting process
However, it is recommended that you set a realistic value in order to optimize the sort configuration (see Recommendations for optimizing the sort in Sort properties)

Length_Average_OSegt

Average length of an Output-Event segment (number of bytes)
If this value is incorrect, the system adjusts it automatically during the sorting process
However, it is recommended that you set a realistic value in order to optimize the sort configuration

>Script.Ges< Section
Sort_Tmp1

Name of the first temporary work file

Sort_Tmp2

Name of the second temporary work file

Optimize the sort

Rules

The sort module uses both the reserved memory and the work files to sort the data:

  • The second work file should be needed only for large volumes of Input-Events.
    If it is used in any other circumstances, the sort is incorrectly configured.
  • To reconfigure the sort, change the settings for the amount of memory and the number of segments.

Sort properties

The list of keys for the sort is deduced automatically from the current parameter settings. If two keys in the list are equal (same position, length and data type), the key with the lower priority in the list is removed.
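The removal of duplicate keys can be pictured with the following minimal Python sketch. It is an illustration only, not the procedure's own code: the SortKey type, its field names and the function name are hypothetical, and the list is assumed to be ordered from highest to lowest priority.

```python
from typing import List, NamedTuple


class SortKey(NamedTuple):
    """Hypothetical description of one sort key."""
    name: str        # label used only for this illustration
    position: int    # offset of the key in the segment
    length: int      # length of the key
    data_type: str   # data type of the key


def drop_duplicate_keys(keys: List[SortKey]) -> List[SortKey]:
    """Keep the first (highest-priority) key of each
    (position, length, data type) combination."""
    seen = set()
    kept = []
    for key in keys:  # assumed ordered from highest to lowest priority
        signature = (key.position, key.length, key.data_type)
        if signature in seen:
            continue  # an equal key with higher priority is already kept
        seen.add(signature)
        kept.append(key)
    return kept
```

Two keys with different names but the same position, length and data type count as equal here, so only the one listed first survives.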

The sort is stable. This means that two segments with equal key values are returned in the order in which they are read.

The sort arranges the elements in ascending order of the key fields. If a key field in a segment is incorrect, the key is set to a low value. Consequently, segments with empty key fields appear at the beginning of the sorted segments.
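The behaviour described above (stable, ascending, low value for an unusable key) can be imitated in a few lines of Python. This is a sketch under assumptions: segments are modelled as dictionaries, keys are strings, and the empty string stands in for the low value. None of these names come from the product.

```python
LOW_VALUE = ""  # assumption: sorts before every non-empty string key


def key_or_low(segment, field):
    """Return the key field, or LOW_VALUE if it is missing, empty or not a string."""
    value = segment.get(field)
    return value if isinstance(value, str) and value else LOW_VALUE


def sort_segments(segments, fields):
    """Sort in ascending key order; Python's sorted() is stable, so segments
    with equal keys keep the order in which they were read."""
    return sorted(segments, key=lambda s: tuple(key_or_low(s, f) for f in fields))
```

With this model, a segment whose key field is empty or absent is returned before all segments that carry a valid key.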

The segments are sorted first on the primary keys. A key is said to be primary if it is applicable equally to all the segments.

Recommendations for optimizing the sort (UNIX and Windows only)

The following formula gives you the minimum amount of memory needed consistent with minimal use of the temporary work files.

Memory_Sort in KB = SMAX * (AvLng + 0.234) / (200 * AvLng)

Where:

  • SMAX = Volume in KB of the total data to be sorted
  • AvLng = Average length in KB of the data segments to be sorted

Example:

If AvLng is 2 KB and SMAX is 512000 KB

Then Memory_Sort is at least 512000 * (2 + 0.234) / (200 * 2) = 2859.52, that is, about 2860 KB.
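For other data volumes, the formula can be evaluated with a small helper such as the Python sketch below. The function and argument names are hypothetical; only the formula itself is taken from this section.

```python
import math


def min_memory_sort_kb(smax_kb: float, avg_len_kb: float) -> float:
    """Minimum Memory_Sort in KB: SMAX * (AvLng + 0.234) / (200 * AvLng)."""
    if smax_kb < 0 or avg_len_kb <= 0:
        raise ValueError("SMAX must be >= 0 and AvLng must be > 0")
    return smax_kb * (avg_len_kb + 0.234) / (200 * avg_len_kb)


# Values from the example: AvLng = 2 KB, SMAX = 512000 KB.
# Rounding up gives a whole number of KB to use for Memory_Sort.
print(math.ceil(min_memory_sort_kb(smax_kb=512000, avg_len_kb=2)))
```

Note that both arguments are expressed in KB, whereas Length_Average_ISegt and Length_Average_OSegt are expressed in bytes.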

Sort Input-Events

You can sort the Input-Events immediately after they are read in or after a call to the restructuring exit.

Sorting is recommended to optimize the loading of the rules and access to the parameter settings.

Rules for sorts

In any of the following situations, a sort is mandatory:

  • If the segments of an Input-Event do not all occur consecutively in an input file
  • If the Group Management option is enabled but either the Input-Events of a group do not occur consecutively in an input file or they are dispersed across more than one file from a single Processing Context-In
  • If the Input-Events are to be aggregated but the segments to be aggregated do not occur consecutively in the input file
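All three situations come down to the same test: do the records that share an identifier form one unbroken run in the input? The Python sketch below shows one way to check this on a sample of identifiers before deciding whether the sort is needed. It is an illustration only; the function name is hypothetical and the identifiers (event, group or aggregation keys) are assumed to have been extracted already.

```python
def occurs_consecutively(keys):
    """Return True if every distinct key forms a single contiguous run."""
    closed = set()    # keys whose run has already ended
    current = None
    started = False
    for key in keys:
        if started and key == current:
            continue              # still inside the current run
        if key in closed:
            return False          # the key reappears after its run ended
        if started:
            closed.add(current)   # the previous run is now finished
        current = key
        started = True
    return True
```

For example, the sequence A, A, B, B passes the check, while A, B, A does not, so a file with the second layout would require the sort.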

This sort is possible only if the session has an Input (E) step.

Parameter settings in script.ges and sys.dat

Keyword Description/Values

>Configuration< Section

I_Sort_IEvent

Yes

Length_Average_ISegt

Average size of an Input-Event segment (in bytes)
If this value is incorrect, it is adjusted dynamically during the sort
However, it is in your interest to give a figure as close as possible to the correct value as this optimizes the configuration of the sort
(See paragraph Optimize the sort)

Rules for defining the value of Length_Average_ISegt:

Minimum Value Allowed    Maximum Value Allowed    Default Value
20 bytes                 4000 bytes               500 bytes

>Script.ges< Section
I_Tmp_IEvent_Sort

Temporary file containing the sorted Input-Events (for a Processing Context-In) before aggregation
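The limits given above for Length_Average_ISegt (20 to 4000 bytes, default 500 bytes) can be checked before a value is written to the parameter file. The Python sketch below is a hypothetical pre-check, not part of the product. In particular, rejecting an out-of-range value with an error is an assumption, since this section does not say how the procedure reacts to one.

```python
MIN_AVG_LEN = 20       # minimum value allowed, in bytes
MAX_AVG_LEN = 4000     # maximum value allowed, in bytes
DEFAULT_AVG_LEN = 500  # default value, in bytes


def resolve_average_length(value=None):
    """Return the average segment length to configure, in bytes."""
    if value is None:
        return DEFAULT_AVG_LEN  # nothing supplied: use the documented default
    if not MIN_AVG_LEN <= value <= MAX_AVG_LEN:
        # Assumption: treat an out-of-range value as a configuration error.
        raise ValueError(
            f"average length must be between {MIN_AVG_LEN} and {MAX_AVG_LEN} bytes"
        )
    return value
```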

The sort keys for Input-Events

The table below lists, in decreasing order of priority, the keys you can specify for sorting Input-Event segments.

Order of Priority    Field
1                    The group code, if the Group Management option is enabled
2                    The name of the Input-Event type
3                    The version number of the Input-Event type
4                    The instance code
5                    The name of the phase that is in anomaly, if the option for phase-specific recycling is enabled
6                    The segment name
7                    The DAR (Date of Application for Rules)

The following only apply if the aggregation of Input-Events is implemented:

Order of Priority    Field
8                    The name of the aggregation rule applied to the segment.
                     This is set in the preprocessing parameters for the Input-Event type.
9                    The start date of validity of the version of the aggregation rule applied to the segment.
                     This is calculated from the value of the DAR.
10                   The values of the aggregation criteria.
                     These are extracted from the segment in accordance with the version of the aggregation rule applied.

If one of these keys can be identified by context, it does not appear in the list of sort keys.

This list may be empty if the aggregation of Input-Events is not enabled and all the identifiers are defined by context.
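The way the key list is derived from the options can be summarized by the following Python sketch. It is an illustration of the rules in the two tables above, not the procedure's implementation: the function name, the argument names and the key labels are hypothetical.

```python
def input_event_sort_keys(group_management, phase_recycling, aggregation,
                          known_from_context=frozenset()):
    """Return the Input-Event sort keys in decreasing order of priority."""
    candidates = [
        ("group code", group_management),
        ("Input-Event type name", True),
        ("Input-Event type version", True),
        ("instance code", True),
        ("phase in anomaly", phase_recycling),
        ("segment name", True),
        ("DAR", True),
        ("aggregation rule name", aggregation),
        ("aggregation rule validity start date", aggregation),
        ("aggregation criteria values", aggregation),
    ]
    # A key is dropped if its option is disabled or if it is identified by context.
    return [name for name, enabled in candidates
            if enabled and name not in known_from_context]
```

With aggregation disabled and every identifier supplied by context, the function returns an empty list, which corresponds to the empty key list mentioned above.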

Sort Output-Events

The sort of the Output-Events is enabled automatically when aggregation of the Output-Events is requested. It can sort the Output-Events to be aggregated either within each group or across all groups together.

The following table lists the keywords that you need to configure.

Parameter settings in script.ges and sys.dat

Keyword Description/Values
>ScriptConfiguration< Section
Length_Average_OSegt

Average length of an Output-Event segment (in bytes)
If this value is incorrect, it is adjusted dynamically during the sort
However, it is in your interest to give a figure as close as possible to the correct value so as to optimize the sort configuration

Rules for defining the value of Length_Average_OSegt:

Minimum Value Allowed    Maximum Value Allowed    Default Value
20 bytes                 4000 bytes               500 bytes

>Script.ges< Section
P_Tmp_OSegt_Group

The name of the work file containing the Output-Events of a group to be aggregated

P_Tmp_OSegt_Sort_Group

The name of the work file containing the sorted Output-Events of a group to be aggregated

O_Tmp_OSegt_Sort_Session

Name of the work file containing the sorted Output-Events for aggregation by output

The sort keys

The table below lists, in decreasing order of priority, the keys you can specify for sorting Output-Event segments.

Order of Priority    Field
1                    The name of the output associated with the Output-Event
2                    The name of the aggregation rule applied to the Output-Event
3                    The start date of validity of the version of the aggregation rule applied to the Output-Event.
                     This is calculated from the value of the DAR.
4                    The values of the aggregation criteria.
                     These are extracted from the Output-Event in accordance with the version of the aggregation rule applied.

As with Input-Events, if the output can be identified from the context, it does not appear in the list of sort keys.
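The reason the sort precedes aggregation can be illustrated in Python: once the Output-Events are ordered on the four keys, events that share the same keys are adjacent and can be aggregated in a single pass. The sketch below rests on assumptions: events are modelled as dictionaries, the field names are hypothetical, dates are ISO-formatted strings, and summing an "amount" field stands in for whatever the aggregation rule actually computes.

```python
from itertools import groupby

# Hypothetical field names, in decreasing order of priority.
KEY_FIELDS = ("output", "rule", "rule_valid_from", "criteria")


def sort_key(event):
    """Build the composite sort key of one Output-Event."""
    return tuple(event[field] for field in KEY_FIELDS)


def aggregate(events):
    """Sort on the composite key, then total each run of equal keys."""
    ordered = sorted(events, key=sort_key)
    return [(key, sum(event["amount"] for event in run))
            for key, run in groupby(ordered, key=sort_key)]
```

groupby only merges adjacent items, so without the preceding sort two events with the same keys but separated in the input would produce two separate aggregates.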
