OPH_REDUCE

Description

Type

Data Process.

Behaviour

It performs a reduction operation on a datacube with respect to implicit dimensions.

Parameters

  • cube: name of the input datacube. The name must be in PID format.
  • schedule: scheduling algorithm. The only possible value is 0, for a static linear block distribution of resources.
  • group_size: size of the aggregation set. If set to “all”, the reduction will occur on all elements of each tuple.
  • operation: reduction operation. Possible values are:
    • “count”: to count the actual (non-missing) values
    • “max”: to evaluate the maximum value
    • “min”: to evaluate the minimum value
    • “avg”: to evaluate the mean value
    • “sum”: to evaluate the sum
    • “std”: to evaluate the standard deviation
    • “var”: to evaluate the variance
    • “cmoment”: to evaluate the central moment
    • “acmoment”: to evaluate the absolute central moment
    • “rmoment”: to evaluate the raw moment
    • “armoment”: to evaluate the absolute raw moment
    • “quantile”: to evaluate the quantile
    • “arg_max”: to evaluate the index of the maximum value
    • “arg_min”: to evaluate the index of the minimum value
  • order: order used to evaluate the moments, or the quantile value in the range [0, 1].
  • missingvalue: value to be treated as missing; by default it is NAN (for float and double).
  • grid: optional argument used to identify the grid of dimensions to be used (if the grid already exists) or the one to be created (if the grid has a new name). If it isn’t specified, no grid will be used.
  • container: name of the container to be used to store the output cube; by default, it is the input container.
  • check_grid: optional flag to be enabled in case the values of grid have to be checked (valid only if the grid already exists).
  • description: additional description to be associated with the output cube.
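
The interaction between group_size, operation and missingvalue can be illustrated with a minimal Python sketch. It is not the operator's implementation: the function name reduce_groups is hypothetical, only a few of the operations are covered, and two behaviours are assumptions made for illustration (a trailing group shorter than group_size is reduced as it is, and a group with no valid values yields the missing value).

```python
import math

def reduce_groups(values, operation, group_size="all", missingvalue=math.nan):
    """Reduce a 1-D sequence in consecutive groups of `group_size` elements."""
    if not values:
        return []

    def is_missing(v):
        # NaN never compares equal to itself, so it needs an explicit test.
        if math.isnan(missingvalue):
            return math.isnan(v)
        return v == missingvalue

    # "all" means a single group spanning the whole sequence.
    size = len(values) if group_size == "all" else int(group_size)
    if size < 1:
        raise ValueError("group_size must be at least 1")

    reducers = {
        "max": max,
        "min": min,
        "sum": sum,
        "avg": lambda xs: sum(xs) / len(xs),
    }

    result = []
    for start in range(0, len(values), size):
        # Missing values are dropped before the group is reduced.
        valid = [v for v in values[start:start + size] if not is_missing(v)]
        if operation == "count":
            result.append(len(valid))
        elif valid:
            result.append(reducers[operation](valid))
        else:
            # Assumption: an all-missing group yields the missing value.
            result.append(missingvalue)
    return result

series = [1.0, 5.0, math.nan, 2.0, 8.0, 3.0]
print(reduce_groups(series, "max", group_size=3))      # [5.0, 8.0]
print(reduce_groups(series, "count", group_size="all"))  # [5]
```

With group_size=3 the six values form two groups and the NaN in the first group is ignored; with group_size="all" the whole sequence is one group, so a single value is returned.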

System parameters

  • exec_mode: operator execution mode. Possible values are async (default) for asynchronous mode and sync for synchronous mode with JSON-compliant output.
  • ncores: number of parallel processes to be used (min. 1).
  • sessionid: session identifier used server-side to manage sessions and jobs. Usually, users don’t need to use/modify it, except when it is necessary to create a new session or switch to another one.
  • objkey_filter: filter on the output of the operator written to file (default=all => no filter, none => no output, reduce => shows operator’s output PID as text).

Examples

Compute the maximum values of 10-element groups in the datacube identified by the PID “URL/1/1”:

[OPH_TERM] >>  oph_reduce operation=max;group_size=10;cube=URL/1/1;grid=new_grid;
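
The command above is a sequence of key=value pairs, each terminated by a semicolon. When such commands are generated from a script, a small helper can assemble the string. The sketch below assumes nothing beyond the syntax visible in the example; build_command is a hypothetical name, not part of any Ophidia client.

```python
def build_command(operator, arguments):
    """Join arguments as key=value pairs, each terminated by a semicolon."""
    pairs = "".join(f"{key}={value};" for key, value in arguments.items())
    return f"{operator} {pairs}"

command = build_command("oph_reduce", {
    "operation": "max",
    "group_size": 10,
    "cube": "URL/1/1",
    "grid": "new_grid",
})
print(command)  # oph_reduce operation=max;group_size=10;cube=URL/1/1;grid=new_grid;
```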

Arguments

Argument name | Type | Mandatory | Values | Default | Min/Max-value
sessionid | “string” | “no” | | “null” |
ncores | “int” | “no” | | “1” | “1” /
exec_mode | “string” | “no” | “async|sync” | “async” |
cube | “string” | “yes” | | |
schedule | “int” | “no” | “0” | “0” |
group_size | “string” | “no” | | “all” | “1” /
operation | “string” | “yes” | “count|max|min|avg|sum|std|var|cmoment|acmoment|rmoment|armoment|quantile|arg_max|arg_min” | |
order | “real” | “no” | | “2” | “0” /
missingvalue | “real” | “no” | | “NAN” |
grid | “string” | “no” | | “-” |
container | “string” | “no” | | “-” |
check_grid | “string” | “no” | “yes|no” | “yes” |
description | “string” | “no” | | “-” |
objkey_filter | “string” | “no” | “all|none|reduce” | “all” |
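
The constraints listed in the arguments table (mandatory arguments, allowed values, minimum values) can be checked on the client side before a command is submitted. The following Python sketch encodes only what the table states; validate_arguments and the constant names are hypothetical, and the server may apply further checks not shown here.

```python
# Allowed values, taken from the "Values" column of the arguments table.
ALLOWED_VALUES = {
    "exec_mode": {"async", "sync"},
    "schedule": {"0"},
    "operation": {
        "count", "max", "min", "avg", "sum", "std", "var", "cmoment",
        "acmoment", "rmoment", "armoment", "quantile", "arg_max", "arg_min",
    },
    "check_grid": {"yes", "no"},
    "objkey_filter": {"all", "none", "reduce"},
}
MANDATORY = ("cube", "operation")

def validate_arguments(arguments):
    """Return a list of problems found; an empty list means no problem."""
    errors = []
    for name in MANDATORY:
        if name not in arguments:
            errors.append(f"missing mandatory argument: {name}")
    for name, value in arguments.items():
        allowed = ALLOWED_VALUES.get(name)
        if allowed is not None and str(value) not in allowed:
            errors.append(f"invalid value for {name}: {value}")
    # Minimum values from the "Min/Max-value" column; defaults from "Default".
    if int(arguments.get("ncores", 1)) < 1:
        errors.append("ncores must be at least 1")
    group_size = arguments.get("group_size", "all")
    if group_size != "all" and int(group_size) < 1:
        errors.append("group_size must be at least 1")
    if float(arguments.get("order", 2)) < 0:
        errors.append("order must be at least 0")
    return errors

print(validate_arguments({"cube": "URL/1/1", "operation": "max", "group_size": 10}))
# []
print(validate_arguments({"operation": "median"}))
# ['missing mandatory argument: cube', 'invalid value for operation: median']
```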