Cleaning up execution data
As part of a Hyperscale execution run, the system creates data files (unload service) and masked files (masking service) on the file server. Because this data can be large (2 times the size of the source data) and can include sensitive information, it is important to clean it up. The unload service, masking service, and load service also store transient internal data while an execution is running. This data is not required once the execution is completed and should be cleaned up as well. The data can be cleaned up in the following three ways.
1. Using retain_execution_data
While setting up a Hyperscale job (POST /jobs), you can set the retain_execution_data property to tell the system when it should clean up data automatically, based on the table below.
| EXECUTION_STATUS | RETAIN_EXECUTION_DATA | CLEAN UP AUTOMATICALLY? |
|---|---|---|
| NA (SUCCESS/FAILED/CANCELED) | NO | YES |
| SUCCESS | ON_ERROR | YES |
| FAILED | ON_ERROR | NO |
| NA (SUCCESS/FAILED/CANCELED) | ALWAYS | NO |
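The table can also be read as a small lookup. The following is a minimal sketch in Python, not Hyperscale code: the function name `cleans_up_automatically` is hypothetical, and it returns `None` for any combination the table does not list (for example, CANCELED with ON_ERROR) rather than guessing.

```python
from typing import Optional

# Rows of the table that depend on the execution status:
# (execution status, retain_execution_data) -> cleaned up automatically?
_STATUS_DEPENDENT_ROWS = {
    ("SUCCESS", "ON_ERROR"): True,
    ("FAILED", "ON_ERROR"): False,
}


def cleans_up_automatically(execution_status: str,
                            retain_execution_data: str) -> Optional[bool]:
    """Return True/False per the table, or None if the table has no matching row."""
    if retain_execution_data == "NO":
        # Data is never retained, so it is cleaned up for every status.
        return True
    if retain_execution_data == "ALWAYS":
        # Data is always retained, so it is never cleaned up automatically.
        return False
    return _STATUS_DEPENDENT_ROWS.get((execution_status, retain_execution_data))
```

For example, `cleans_up_automatically("FAILED", "ON_ERROR")` yields `False`, meaning the data stays on the file server until it is cleaned up by one of the other two methods.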
2. Manual clean up
Hyperscale exposes a delete API (DELETE /executions/{id}) that you can use to manually clean up the data for an execution if it has not already been cleaned up.
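As an illustration, a call to this endpoint could be prepared with Python's standard library as sketched below. Only the method and path (DELETE /executions/{id}) come from this page; the helper name `build_cleanup_request`, the base URL, and the Authorization header value are placeholders that depend on your deployment.

```python
import urllib.request


def build_cleanup_request(base_url: str, execution_id: int,
                          auth_header: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE /executions/{id} request."""
    # base_url and auth_header are assumptions; use the values for your installation.
    url = f"{base_url.rstrip('/')}/executions/{execution_id}"
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": auth_header},
    )
```

The request would then be sent with `urllib.request.urlopen(request)`, for example after `request = build_cleanup_request("https://hyperscale.example.com/api", 42, "<your-credentials>")`.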
3. Start a new execution
When starting a new execution, Hyperscale first checks whether the data from the previous execution has been cleaned up. If it has not, Hyperscale triggers a cleanup before starting the new execution.