S4PM 5.9.0 RELEASE NOTES

Release Date

January 10, 2006

Functional Changes

  1. New S4PM Test Package - A new optional package has been added to S4PM in addition to the mandatory packages: S4P, S4PM, and S4PM_CFG. This package, S4PM_TEST, contains pre-configured Stringmaker configuration files and two synthetic algorithms. It is intended as a test package for validating a new S4PM installation. It can also function as a working example for new users of S4PM. A single command will create a new S4PM string configured with the two algorithms, start up the S4PM Monitor, start up the stations, and run the algorithms therein. Upon successful completion, the string is shut down.

  2. Stringmaster Removal - Stringmaster has been deprecated for some time and has now been dropped from the S4PM baseline. All users should now use Stringmaker, its successor.

  3. Generalization of Stringmakerall - The Stringmakerall utility has been generalized so that it works at any site. In previous releases this utility, which runs Stringmaker on all strings on a particular machine, was too specific for the GES DISC environment where it was born.

  4. Fix To Algorithm Info Tool - In previous releases, the Algorithm Information tool (available via the S4PM Monitor) displayed information on all algorithms installed in the string, which was not necessarily the set of algorithms actually configured to run in the string. This has been fixed. The display in this release shows only those algorithms that are currently configured to run in the string.

  5. Finish Incomplete Order Failure Handler - On-demand strings now contain a Finish Incomplete Order failure handler in the Track Requests station. If a user's order is only partially complete, an operator can select this failure handler to force the release of the part of the order that was successful. The parts that failed are left out of the order.

  6. New Smart Disk Allocation - Added support for "smart" disk allocation. This option is enabled or disabled via the new $smart_allocation parameter in the Stringmaker string configuration file. When enabled, the allocation of disk pool space is based upon actual file sizes rather than on a predicted (usually a maximum) file size. This is done by first making an allocation based on the predicted file size and then, once the file arrives or is created, adjusting that allocation based upon the actual file size. This feature is particularly useful in strings where file sizes vary greatly, such as in on-demand processing strings where most of the processing is devoted to subsetting or subsampling services. When enabled, smart allocation applies to all data types. Smart allocation is disabled by default. N.B.: When using smart disk allocation, the -g option of s4pm_tk_disk_alloc.pl is recommended so that the display shows gigabytes rather than number of files (the file count is typically miscalculated in this case).

  7. New Option for FTP Pull Expiration File - When a string is configured so that its inputs are symbolic links to an FTP pull area in ECS Datapool, this new option creates a file of URs of data that can be safely deleted (expired) from the FTP pull area. It is assumed that some script (not part of S4PM) will read the file and perform the expiration.

  8. S4P::ResPool::create_pools - Added as a callable function (formerly only usable via the script cpooldb.pl).

  9. Allocation Skipped if Symbolic Link - In 5.8.1, data files that were actually symbolic links were assigned a small nominal file size of 2048 bytes, overriding whatever maximum size was set for files of that type. In this release, rather than allocating a small amount of disk space, the allocation process is skipped altogether.

  10. Bug Fix in S4PM::Algorithm - A bug affecting the file accumulation production rule was fixed in this release. The bug caused a failure in the Select Data station, specifically in the s4pm_preselect_data.pl script, for algorithms employing the %file_accumulation_parms parameter in the Stringmaker algorithm configuration file.

  11. Bug Fix in S4PM - A bug was fixed in the make_patterned_filename() function whereby the day of year portion of the production date/time part of the data file name (the ~N component) wasn't getting padded properly with zeros for days of the year prior to 100. This caused data file names assigned by the Prepare Run station to be incorrect. The error propagated to the Allocate Disk station which couldn't properly parse these incorrect file names to locate the data type. The result was that Allocate Disk continuously recycled work orders and no disk was allocated. This problem has now been fixed.

  12. Fixes for Data Mining Edition - A number of fixes were made in support of S4PM Data Mining Edition (DME).

  13. View Disk Allocations and Usage Tool - In on-demand strings, the View Disk Allocations and Usage tool by default no longer displays the INPUT disk pool, since all inputs are only symbolic links. In addition, the OUTPUT disk pool is shown in gigabytes rather than number of files. These changes were made because on-demand strings are, by default, configured with smart disk allocation turned on.

  14. Work Order Name Refactoring - A number of work order names have been changed to better associate them with station names. The table below shows the old and new work order names:

    Old Work Order Name    New Work Order Name
    TRIGGER_DATA           REGISTER
    NEWDATA_algorithm      SELECT_algorithm
    MOREDATA_algorithm     FIND_algorithm
    TRIGGER_algorithm      PREPARE_algorithm
    GETDISK_algorithm      ALLOCATE_algorithm
    CLEAN                  SWEEP
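
Sites with local scripts or configuration that still use the old names may find it handy to express the table as a lookup. The following is only an illustrative sketch in Python (S4PM itself is written in Perl). The helper name rename_work_order, the assumption that a work order type is either a bare name or a prefix joined to the algorithm name by an underscore, and the algorithm name MoPGE01 in the usage note are hypothetical and not part of S4PM.

```python
# Old-to-new work order names, taken from the table above.
EXACT_RENAMES = {"TRIGGER_DATA": "REGISTER", "CLEAN": "SWEEP"}

# These entries apply to names of the form PREFIX_algorithm.
PREFIX_RENAMES = {
    "NEWDATA": "SELECT",
    "MOREDATA": "FIND",
    "TRIGGER": "PREPARE",
    "GETDISK": "ALLOCATE",
}


def rename_work_order(name: str) -> str:
    """Return the 5.9.0 name for a pre-5.9.0 work order type (hypothetical helper)."""
    # Exact names are checked first so that TRIGGER_DATA is not treated as
    # TRIGGER_<algorithm> with an algorithm called DATA.
    if name in EXACT_RENAMES:
        return EXACT_RENAMES[name]
    prefix, sep, algorithm = name.partition("_")
    if sep and prefix in PREFIX_RENAMES:
        return f"{PREFIX_RENAMES[prefix]}_{algorithm}"
    return name  # already a new-style name, or not a renamed type
```

For example, rename_work_order("GETDISK_MoPGE01") gives "ALLOCATE_MoPGE01", while a name that is already new-style, such as "SWEEP", is returned unchanged.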

Detailed File Changes

Algorithm.pm

  • Changed NEWDATA work order name to SELECT in prolog.

  • Bug fix in accumulation_start() for handling the file accumulation production rule. A hash reference was made to attribute 'boundary' when it should have been 'window_boundary'. This was corrected.

Blocking.pm

  • Changed NEWDATA work order name to SELECT in prolog.

cpooldb.pl

  • Modified to call S4P::ResPool::create_pools() to do most of the work.

FileGroup.pm

  • Modified signature of data_version($version) to data_version($version, [$format]) to accept an optional format argument.

OdlTree.pm

  • Modified the lexer dictionary to account for "GROUPTYPE" and "OBJECTTYPE" keywords.

ResPool.pm

  • Function create_pools() has been added to do disk pool database creation. It calls S4P::ResPool::write_to_pool to do most of its work.

  • Function write_to_pool has been modified to allow for arrays of pools to be updated, and to allow an $init argument that deletes all the existing pools before writing.

s4p_import.pl

  • Changed work order name MOREDATA to FIND in prolog.

s4p_subscribe.pl

  • Added an option (-d) to watch a queue (directory) of PDRs.

S4PM.pm

  • Added free_disk(), a routine extracted from s4pm_sweep_data.pl. This subroutine frees disk space from the disk allocation pool. If the $actual_size parameter is 0, the maximum size (set in the Stringmaker data types configuration file) is used. If the $actual_size parameter is 1, the actual size of the file or set of files is used. In this case, the actual size may be passed in as an argument in case the file has already been deleted from disk, as happens with s4pm_sweep_data.pl. If the $actual_size parameter is 2, the difference between the maxsize and the actual size on disk is used; this mode is used by s4pm_register_data.pl. Thus, for input files, two deallocations from the original maxsize allocation are done: the first deallocates the "unused" space once the actual file is found to be smaller than the maxsize; the second deallocates the rest of that allocation, i.e., the actual size itself, when the file is removed.

  • Bug fixed in make_patterned_filename() whereby the day of year portion of the production date/time (the ~N component) wasn't getting padded properly with zeros for days of the year prior to 100.
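
The padding issue is easy to reproduce in any language: a day of year written without a fixed width yields 1 to 3 characters, which breaks any parser that expects a fixed-width field. The following is a minimal sketch in Python of the intended behavior only. It is not the S4PM Perl code, the helper name day_of_year_field is hypothetical, and the full layout of the ~N component is not reproduced here.

```python
from datetime import date


def day_of_year_field(d: date) -> str:
    """Return the day of year as a fixed-width, zero-padded 3-digit string."""
    # The 03d format pads with leading zeros to a width of 3,
    # so days 1 through 99 become 001 through 099.
    return f"{d.timetuple().tm_yday:03d}"
```

With this formatting, January 10, 2006 yields "010" rather than "10", so every file name carries a day-of-year field of the same width.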

s4pm_airs_L0_check.pl

  • Changed NEWDATA work order name to SELECT.

s4pm_delete_expired_data.pl

  • Changed CLEAN work order name to SWEEP.

s4pm_derived.cfg

  • Deprecated and REMOVED from the baseline.

s4pm_failed_find_data_handler.pl

  • Changed MOREDATA_algorithm work order name to FIND_algorithm.

s4pm_failed_order_handler.pl

  • This script allows partially successful orders in on-demand to be released to the user. The script had been part of the baseline for on-demand processing, but fell out of the baseline due to some testing issues. It has been brought back into the baseline.

s4pm_failed_pge_handler.pl

  • Changed CLEAN work order name to SWEEP.

s4pm_failed_qc_handler.pl

  • Changed TRIGGER_DATA work order name to REGISTER.

  • Changed CLEAN work order name to SWEEP.

s4pm_failed_service_handler.pl

  • Changed CLEAN work order name to SWEEP.

s4pm_find_data.pl

  • Changed MOREDATA work order name to FIND.

  • Changed TRIGGER_algorithm work order name to PREPARE_algorithm.

s4pm_global.cfg

  • Deprecated and REMOVED from the baseline.

s4pm_insert_datapool.pl

  • Changed MOREDATA_algorithm work order name to FIND_algorithm.

s4pm_max_children.cfg

  • Deprecated and REMOVED from the baseline.

S4PMOD.pm

  • Changed CLEAN work order name to SWEEP.

s4pm_pge_esdt.cfg

  • Deprecated and REMOVED from the baseline.

s4pm_poll_data.pl

  • Changed TRIGGER_DATA work order name to REGISTER.

s4pm_prepare_run.pl

  • Changed TRIGGER_algorithm work order name to PREPARE_algorithm.

  • Changed GETDISK_algorithm work order name to ALLOCATE_algorithm.

s4pm_preselect_data.pl

  • Changed NEWDATA work order name to SELECT.

  • Changed MOREDATA work order name to FIND.

  • Modified some code to account for .xml files in Data Mining strings.

s4pm_purge_bad_data.pl

  • Changed TRIGGER_algorithm work order name to PREPARE_algorithm.

  • Changed CLEAN work order name to SWEEP.

s4pm_receive_dn.pl

  • Changed TRIGGER_DATA work order name to REGISTER.

  • Changed MOREDATA_algorithm work order name to FIND_algorithm in prolog.

  • Changed TRIGGER_algorithm work order name to PREPARE_algorithm.

s4pm_register_data.pl

  • Changed NEWDATA_algorithm work order name to SELECT_algorithm.

  • Changed TRIGGER_DATA work order name to REGISTER.

  • Added a new argument, -d allocation_db_file, which tells the program to deallocate the difference between the maxsize and actual file size from the disk pool.

s4pm_regular_block.pl

  • Changed NEWDATA work order name to SELECT in prolog.

s4pm_request_data.pl

  • Added option -l to skip disk allocations for all data types (it is assumed that all are symbolic links and, therefore, no allocations are necessary).

s4pm_run_algorithm.pl

  • Changed TRIGGER_DATA work order name to REGISTER.

  • Changed CLEAN work order name to SWEEP.

s4pm_s4pa_subscription.pl

  • Changed TRIGGER_DATA work order name to REGISTER.

s4pm_select_data.cfg

  • Changed NEWDATA_algorithm work order name to SELECT_algorithm.

  • Changed MOREDATA_algorithm work order name to FIND_algorithm.

s4pm_select_data.pl

  • Changed NEWDATA_algorithm work order name to SELECT_algorithm.

  • Changed MOREDATA_algorithm work order name to FIND_algorithm.

s4pm_split_services.pl

  • Changed MOREDATA_algorithm work order name to FIND_algorithm.

s4pm_stage_for_pickup.pl

  • New script, but it's basically a fork of s4p_import.pl tailored for S4PM Data Mining strings (it uses copy rather than FTP, for instance).

s4pm_static.cfg

  • Deprecated and REMOVED from the baseline.

s4pm_string.cfg

  • Deprecated and REMOVED from the baseline.

s4pm_stringmakerall.pl

  • Generalized so that Stringmaker string configuration files don't have to adhere to a restricted file naming convention. In previous releases, the file names had to be S4PMnn_xx_yy where nn represented the machine name (relevant only for the GES DISC), xx represented the string ID (e.g. MO for MODIS, AI for AIRS), and yy represented the instance (e.g. RE for reprocessing, FW for forward). In this version, any file naming convention is supported.
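
To make the old restriction concrete, the sketch below shows the kind of pattern a file name had to satisfy before this release. It is written in Python for illustration and is not taken from s4pm_stringmakerall.pl. The assumptions are that nn, xx, and yy are each two alphanumeric characters and that an optional file extension may follow; the helper name parse_legacy_name and the sample file names are hypothetical.

```python
import re
from typing import Optional

# Assumed shape of the pre-5.9.0 convention S4PMnn_xx_yy, with an
# optional extension. The two-character widths are an assumption.
LEGACY_NAME = re.compile(
    r"^S4PM(?P<machine>[A-Za-z0-9]{2})"
    r"_(?P<string_id>[A-Za-z0-9]{2})"
    r"_(?P<instance>[A-Za-z0-9]{2})"
    r"(?:\.\w+)?$"
)


def parse_legacy_name(filename: str) -> Optional[dict]:
    """Split a legacy-style name into its parts, or return None if it does not fit."""
    match = LEGACY_NAME.match(filename)
    return match.groupdict() if match else None
```

A name such as S4PM01_MO_FW.cfg fits this pattern, whereas a site-specific name such as my_string.cfg does not; under 5.9.0 both kinds of name are acceptable.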

s4pm_stringmaker_algorithm.cfg

  • Changed NEWDATA_algorithm work order name to SELECT_algorithm.

  • Changed MOREDATA_algorithm work order name to FIND_algorithm.

s4pm_stringmaker_derived.cfg

  • Changed TRIGGER_DATA work order name to REGISTER.

  • Changed NEWDATA_algorithm work order name to SELECT_algorithm.

  • Changed MOREDATA_algorithm work order name to FIND_algorithm.

  • Changed TRIGGER_algorithm work order name to PREPARE_algorithm.

  • Changed GETDISK_algorithm work order name to ALLOCATE_algorithm.

  • Changed CLEAN work order name to SWEEP.

  • Fixed the code that creates the checksums directory. Replaced '$config_files{"$sta/checksums/"} = 1;' with '$config_files{"$sta/checksums/"} = 0;'

  • Added s4pm_failed_order_handler.pl as a failure handler in the Track Requests station.

  • Modified the settings of %cfg_failure_handlers hash so that ad hoc failure handlers could be added via the Stringmaker string configuration file. The way it was constructed before, the failure handlers set in this file would override those set in the string configuration file rather than being added to them.

  • Added support for the new $smart_allocation parameter that enables or disables disk allocations based on actual file sizes.

  • Added support for the new $input_symlink_expiration_file in the Sweep Data station. In doing so, moved the setting of %cfg_commands for this station completely out of the s4pm_stringmaker_static.cfg to this file.

  • Modified the Allocate Disk station so that, for on-demand strings, the 'View Disk Allocations and Usage' tool is configured without the INPUT disk pool (since it contains only symbolic links) and the OUTPUT disk pool is shown in gigabytes instead of number of files.

  • Modified the main executable name in the Stage for Pickup station changing it from s4p_import.pl to s4pm_stage_for_pickup.pl.
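
The %cfg_failure_handlers change above replaces an override with a merge. The difference can be sketched as follows in Python rather than the Perl used by the configuration files. The nested layout (station, then handler label, then command), the function name merge_failure_handlers, the station key, and the ad hoc handler in the usage note are all assumptions made for illustration; only the Finish Incomplete Order handler and its script name come from these notes.

```python
def merge_failure_handlers(defaults: dict, ad_hoc: dict) -> dict:
    """Combine default and ad hoc failure handlers for each station.

    Ad hoc entries are added to the defaults instead of being discarded.
    If both define the same label for a station, the ad hoc command wins
    in this sketch.
    """
    # Copy the inner dictionaries so the defaults are not modified in place.
    merged = {station: dict(handlers) for station, handlers in defaults.items()}
    for station, handlers in ad_hoc.items():
        merged.setdefault(station, {}).update(handlers)
    return merged
```

Given defaults of {"track_requests": {"Finish Incomplete Order": "s4pm_failed_order_handler.pl"}} and an ad hoc entry {"track_requests": {"Site Cleanup": "site_cleanup.pl"}}, the merged result for track_requests contains both handlers, whereas an override would have left only one of them.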

s4pm_stringmaker_static.cfg

  • Changed CLEAN work order name to SWEEP.

  • Moved the setting of %cfg_commands for the Sweep Data station completely out to the s4pm_stringmaker_derived.cfg file.

  • Got rid of the hardwired and superfluous offset times set in the Register Data station. These offsets are already set in the s4pm_stringmaker_derived.cfg file.

  • Removed from the Allocate Disk station the setting of the 'View Disk Allocations and Usage' tool via the %cfg_interfaces parameter. It was moved to the s4pm_stringmaker_derived.cfg file instead.

s4pm_stringmaker_string.cfg

  • Changed MOREDATA_algorithm work order name to FIND_algorithm.

  • Added the parameter $input_symlink_expiration_file along with a description.

s4pm_stringmaster.pl

  • Deprecated and REMOVED from the baseline.

s4pm_stringmasterall.pl

  • Deprecated and REMOVED from the baseline.

s4pm_sweep_data.pl

  • Changed CLEAN work order name to SWEEP.

  • Added a new argument, -a, which tells the program to deallocate the actual file size, instead of the maxsize. This needs to be used in conjunction with the -d option in s4pm_register_data.pl to make the final deallocation come out right.

  • The disk deallocation code has been moved, with heavy modification, to S4PM::free_disk().

  • Added a new argument, -expire expiration_file, which if set, will cause the program to write the URs of data being swept to that file. This is intended to allow an external script to manually expire data staged for on-demand before its default timeout.
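
The pairing of -a here with -d in s4pm_register_data.pl follows from the three $actual_size modes described for S4PM::free_disk(). The arithmetic can be sketched as below. This is an illustration in Python, not the Perl implementation; the function names are hypothetical, and sizes are plain integers in arbitrary units. The notes do not say what happens when a file turns out larger than its maxsize, so the sketch does not handle that case.

```python
def bytes_to_free(mode: int, max_size: int, actual_size: int) -> int:
    """Amount returned to the pool for each $actual_size mode of free_disk()."""
    if mode == 0:
        return max_size                # free the full predicted (maximum) size
    if mode == 1:
        return actual_size             # free only what the file really occupied
    if mode == 2:
        return max_size - actual_size  # free the unused part of the prediction
    raise ValueError(f"unknown mode: {mode}")


def smart_allocation_lifecycle(pool_free: int, max_size: int, actual_size: int) -> list:
    """Trace free pool space through allocation, registration, and sweep."""
    trace = [pool_free]
    pool_free -= max_size  # initial allocation uses the predicted size
    trace.append(pool_free)
    pool_free += bytes_to_free(2, max_size, actual_size)  # register: return unused space
    trace.append(pool_free)
    pool_free += bytes_to_free(1, max_size, actual_size)  # sweep: return the rest
    trace.append(pool_free)
    return trace
```

With 100 units free, a predicted size of 10, and an actual size of 3, the free space goes 100, 90, 97, 100. If the sweep step used mode 0 instead (that is, ran without -a) after a mode 2 deallocation, the pool would end up crediting more space than was ever allocated, which is why the two options need to be used together.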

s4pm_tk_admin.pl

  • Changed NEWDATA_algorithm work order name in code to SELECT_algorithm.

  • Changed MOREDATA_algorithm work order name to FIND_algorithm.

  • Changed TRIGGER_algorithm work order name to PREPARE_algorithm.

  • Changed GETDISK_algorithm work order name to ALLOCATE_algorithm.

  • Changed CLEAN work order name to SWEEP.

s4pm_tk_algorithm_info.pl

  • Modified to screen out algorithms that are not currently configured to run in the string. It does this by checking the station.cfg file in the Run Algorithm station.

s4pm_tk_disk_alloc.pl

  • Added -g <format> option to display GB instead of files.

  • Added -x <pool,pool,...> option to exclude one or more pools from display. This is recommended for smart disk allocation with symbolic links. Such pools are essentially not managed for disk allocation.

s4pm_tk_trmon.pl

  • Changed TRIGGER_algorithm work order name to PREPARE_algorithm.

  • Changed GETDISK_algorithm work order name to ALLOCATE_algorithm.

s4pm_track_data.pl

  • Changed CLEAN work order name to SWEEP.

s4pm_track_requests.pl

  • Changed TRIGGER_algorithm work order name to PREPARE_algorithm.

  • Changed GETDISK_algorithm work order name to ALLOCATE_algorithm.

stationmaster.pl

  • Changed GETDISK_algorithm work order name to ALLOCATE_algorithm in prolog.

Subscription.pm

  • Modified to allow regular expressions for specifying data version in subscriptions configuration.

  • Included quoting of arguments in mail command.

TestSupport.pm

  • New module has been added to assist in the creation of test data suites for automated testing. Its reliability is as yet uncertain.

 

Last updated: Dec 08, 2010 11:06 AM ET