OPH_AGGREGATE

Description

Type

Data Process.

Behaviour

It executes an aggregation function on a datacube with respect to explicit dimensions.

Parameters

  • cube: name of the input datacube. The name must be in PID format.
  • schedule: scheduling algorithm. The only possible value is 0, for a static linear block distribution of resources.
  • group_size: number of tuples per group to consider in the aggregation function. If set to “all”, the aggregation will occur on all tuples of the table.
  • operation: reduction operation. Possible values are “count”, “max”, “min”, “avg” and “sum”.
  • missingvalue: value to be treated as missing; by default it is NAN (for float and double).
  • grid: optional argument used to identify the grid of dimensions to be used (if the grid already exists) or the one to be created (if the grid has a new name). If it isn’t specified, no grid will be used.
  • container: name of the container to be used to store the output cube; by default it is the input container.
  • check_grid: optional flag to be enabled in case the values of grid have to be checked (valid only if the grid already exists).
  • description: additional description to be associated with the output cube.
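The interaction of operation, group_size and missingvalue can be illustrated with a small in-memory sketch. It is not the operator's implementation: it assumes that each tuple carries an array of values, that groups are formed from consecutive tuples, that the reduction is applied element-wise across the tuples of a group, and that values equal to missingvalue are skipped. The function name aggregate_tuples is hypothetical.

```python
import math

# Reduction functions for the five operation values listed above.
REDUCERS = {
    "count": len,
    "max": max,
    "min": min,
    "sum": sum,
    "avg": lambda values: sum(values) / len(values),
}


def _is_missing(value, missingvalue):
    # NaN never compares equal to itself, so it needs a dedicated check.
    if isinstance(missingvalue, float) and math.isnan(missingvalue):
        return isinstance(value, float) and math.isnan(value)
    return value == missingvalue


def aggregate_tuples(rows, operation, group_size="all", missingvalue=math.nan):
    """Reduce consecutive groups of rows element-wise (illustrative only)."""
    if operation not in REDUCERS:
        raise ValueError(f"unsupported operation: {operation}")
    if not rows:
        return []
    size = len(rows) if group_size == "all" else int(group_size)
    if size < 1:
        raise ValueError("group_size must be at least 1")

    reduce_values = REDUCERS[operation]
    result = []
    for start in range(0, len(rows), size):
        group = rows[start:start + size]
        reduced = []
        # zip(*group) walks the group position by position across its tuples.
        for column in zip(*group):
            valid = [v for v in column if not _is_missing(v, missingvalue)]
            if valid:
                reduced.append(reduce_values(valid))
            else:
                # Assumption: an all-missing position yields 0 for count,
                # and the missing value itself for every other operation.
                reduced.append(0 if operation == "count" else missingvalue)
        result.append(reduced)
    return result


rows = [[1.0, 2.0], [3.0, math.nan], [5.0, 6.0], [7.0, 8.0]]
print(aggregate_tuples(rows, "max", group_size=2))  # [[3.0, 2.0], [7.0, 8.0]]
print(aggregate_tuples(rows, "count"))              # [[4, 3]]
```

With group_size=2 the four tuples collapse into two, and with the default “all” they collapse into one; the NaN in the second tuple is ignored under the skipping assumption stated above.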

System parameters

  • exec_mode: operator execution mode. Possible values are async (default) for asynchronous mode and sync for synchronous mode with JSON-compliant output.
  • ncores: number of parallel processes to be used (min. 1).
  • nthreads: number of parallel threads per process to be used (min. 1).
  • sessionid: session identifier used server-side to manage sessions and jobs. Usually, users don’t need to use/modify it, except when it is necessary to create a new session or switch to another one.
  • objkey_filter: filter on the output of the operator written to file (default=all => no filter, none => no output, aggregate => shows operator’s output PID as text).

Examples

Compute the maximum values of 10-tuple groups in the datacube identified by the PID “URL/1/1”:

[OPH_TERM] >>  oph_aggregate operation=max;group_size=10;cube=URL/1/1;grid=new_grid;

Arguments

Argument name  Type      Mandatory  Values                   Default  Min/Max-value
sessionid      “string”  “no”                                “null”
ncores         “int”     “no”                                “1”      “1” /
nthreads       “int”     “no”                                “1”      “1” /
exec_mode      “string”  “no”       “async|sync”             “async”
cube           “string”  “yes”
schedule       “int”     “no”       “0”                      “0”
group_size     “string”  “no”                                “all”    “1” /
operation      “string”  “yes”      “count|max|min|avg|sum”
missingvalue   “real”    “no”                                “NAN”
grid           “string”  “no”                                “-”
container      “string”  “no”                                “-”
check_grid     “string”  “no”       “yes|no”                 “yes”
description    “string”  “no”                                “-”
objkey_filter  “string”  “no”       “all|none|aggregate”     “all”
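
The constraints in the argument table (mandatory arguments, allowed values, minimum values) can be checked before a command is submitted. The following is a minimal sketch of such a check; the helper build_oph_aggregate_command is hypothetical and not part of any Ophidia client. It only assembles the key=value; string in the form used in the terminal example.

```python
# Constraints copied from the argument table; the helper is illustrative only.
KNOWN_ARGUMENTS = {
    "sessionid", "ncores", "nthreads", "exec_mode", "cube", "schedule",
    "group_size", "operation", "missingvalue", "grid", "container",
    "check_grid", "description", "objkey_filter",
}
MANDATORY_ARGUMENTS = ("cube", "operation")
ALLOWED_VALUES = {
    "exec_mode": {"async", "sync"},
    "schedule": {"0"},
    "operation": {"count", "max", "min", "avg", "sum"},
    "check_grid": {"yes", "no"},
    "objkey_filter": {"all", "none", "aggregate"},
}
MINIMUM_ONE = {"ncores", "nthreads", "group_size"}


def build_oph_aggregate_command(**arguments):
    """Validate the arguments and return the terminal command as a string."""
    for name in MANDATORY_ARGUMENTS:
        if name not in arguments:
            raise ValueError(f"missing mandatory argument: {name}")

    parts = []
    for name, value in arguments.items():
        if name not in KNOWN_ARGUMENTS:
            raise ValueError(f"unknown argument: {name}")
        text = str(value)
        allowed = ALLOWED_VALUES.get(name)
        if allowed is not None and text not in allowed:
            raise ValueError(f"invalid value for {name}: {text}")
        # group_size also accepts the keyword "all" in place of a number.
        if name in MINIMUM_ONE and text != "all" and int(text) < 1:
            raise ValueError(f"{name} must be at least 1")
        parts.append(f"{name}={text};")

    # Arguments are emitted in the order in which they were passed.
    return "oph_aggregate " + "".join(parts)


print(build_oph_aggregate_command(
    operation="max", group_size=10, cube="URL/1/1", grid="new_grid"))
# prints: oph_aggregate operation=max;group_size=10;cube=URL/1/1;grid=new_grid;
```

Omitting cube or operation, or passing a value outside the listed sets, raises ValueError before anything is sent to the server.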