# SuanShu, a Java numerical and statistical library

com.numericalmethod.suanshu.grid.executor

## Class DefaultGridExecutorFactory

public class DefaultGridExecutorFactory
extends Object

The default factory that creates instances of GridExecutor.

getInstance() returns a factory that configures GridExecutors with the settings specified in a configuration file. By default, the configuration is read from grid.xml on the classpath.

Here is a sample grid.xml:

    <?xml version="1.0"?>
    <grid
        xmlns="http://www.numericalmethod.com/grid"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.numericalmethod.com/grid http://www.numericalmethod.com/xsd/grid.xsd">
        <!-- Uncomment / Comment out the configuration to switch between local and remote operation -->
        <!-- CHOICE 1: LOCAL EXECUTION -->
        <!--
        <local>
            <concurrency>4</concurrency>
        </local>
        -->
        <!-- CHOICE 2: REMOTE EXECUTION -->
        <remote>
            <hostname>192.168.1.101</hostname>
            <slaves>
                <!-- one or more slaves -->
                <slave>
                    <hostname>192.168.1.102</hostname>
                    <port>2552</port>
                </slave>
                <slave>
                    <hostname>192.168.1.103</hostname>
                    <port>2552</port>
                </slave>
            </slaves>
            <!-- Optional: remote execution parameters -->
            <queueSize>2</queueSize>
            <failureDetection>
                <timeout>5000</timeout>
                <scanInterval>1000</scanInterval>
                <maximumRetries>3</maximumRetries>
            </failureDetection>
        </remote>

        <!-- Optional: Random Number Generator Configuration -->
        <rng>
            <repeatable>false</repeatable>
            <dynamicCreator>
                <dcSeeds>1234 5687</dcSeeds>
                <mtSeeds>9876 5432</mtSeeds>
            </dynamicCreator>
        </rng>
    </grid>


### Local Execution

To use local CPUs for computation, use the <local> configuration, in which <concurrency> specifies the number of threads to use. A value of 0 selects the default concurrency, i.e., the number of available processors (see getDefaultConcurrency()).
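The rule for <concurrency> can be illustrated with plain JDK classes. The following is only a sketch of the idea, not the library's implementation: the class LocalConcurrencySketch, its methods, and the squaring workload are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; not part of SuanShu.
public class LocalConcurrencySketch {

    /** Maps a configured concurrency value to a thread count; 0 means "use the default". */
    static int resolveConcurrency(int configured) {
        if (configured < 0) {
            throw new IllegalArgumentException("concurrency must be >= 0");
        }
        return configured == 0 ? Runtime.getRuntime().availableProcessors() : configured;
    }

    /** Squares each input on a fixed thread pool of the resolved size. */
    static List<Integer> squareAll(List<Integer> inputs, int configuredConcurrency) {
        ExecutorService pool = Executors.newFixedThreadPool(resolveConcurrency(configuredConcurrency));
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (final Integer x : inputs) {
                futures.add(pool.submit(() -> x * x));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get()); // results keep the submission order
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Passing 0 therefore sizes the pool by Runtime.getRuntime().availableProcessors(), which is the same quantity that getDefaultConcurrency() is documented to return.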

### Remote Execution

To use remote computers for parallel computation, you need to specify a <remote> section in the configuration file.

First, enter the hostname of the master machine, i.e., the machine on which the code is run. This is the address by which the remote machines refer to the master machine. Please ensure that the address is appropriate for the application, e.g., use only a local network address if the distributed computation is run only on the local network.

Then configure one or more remote slave machines; the master connects to each one via its given hostname and port.
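To check which slave machines a configuration file names, the <slave> entries can be read with the JDK's DOM parser. This is a minimal sketch for inspecting a file, not how the framework itself loads its configuration; the class RemoteConfigSketch and its method are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical illustration; not part of SuanShu.
public class RemoteConfigSketch {

    /** Returns "hostname:port" for every <slave> element in the given XML text. */
    static List<String> slaveAddresses(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList slaves = doc.getElementsByTagName("slave");
            List<String> addresses = new ArrayList<>();
            for (int i = 0; i < slaves.getLength(); i++) {
                Element slave = (Element) slaves.item(i);
                String host = slave.getElementsByTagName("hostname").item(0).getTextContent().trim();
                String port = slave.getElementsByTagName("port").item(0).getTextContent().trim();
                addresses.add(host + ":" + port);
            }
            return addresses;
        } catch (Exception e) {
            throw new RuntimeException("could not parse configuration", e);
        }
    }
}
```

The master's own <hostname> is a direct child of <remote> and is not included in the returned list.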

Optionally, you can set the size of the queues on the slave machines. The queue size sets how many jobs are kept in flight per slave machine. By default, all jobs are sent out immediately, which may be undesirable when the remote machines are shared among applications.

The queue size also affects failure detection (below): a large size increases the cost of a failure, while a small size can reduce performance because slave machines sit idle while waiting for more work. A queue size of two is a good choice for tasks that take longer than a network round-trip (e.g., a second or longer).
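The effect of the queue size can be pictured as a window of in-flight jobs per slave machine. The sketch below models a single slave under that assumption; the class QueueSizeSketch and its methods are hypothetical and do not reflect the framework's internals.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration; not part of SuanShu.
public class QueueSizeSketch {
    private final int queueSize;                             // maximum jobs in flight for this slave
    private final Deque<String> pending = new ArrayDeque<>(); // jobs not yet sent
    private int inFlight = 0;                                 // jobs sent but not yet answered

    QueueSizeSketch(int queueSize) {
        if (queueSize < 1) {
            throw new IllegalArgumentException("queueSize must be >= 1");
        }
        this.queueSize = queueSize;
    }

    void submit(String job) {
        pending.add(job);
    }

    /** Sends pending jobs until the in-flight window is full; returns the jobs sent. */
    List<String> dispatch() {
        List<String> sent = new ArrayList<>();
        while (inFlight < queueSize && !pending.isEmpty()) {
            sent.add(pending.poll());
            inFlight++;
        }
        return sent;
    }

    /** Called when the slave returns a result, which frees one slot in the window. */
    void onResult() {
        if (inFlight > 0) {
            inFlight--;
        }
    }

    int inFlight() {
        return inFlight;
    }
}
```

With a window of two, the slave always has one job queued behind the one it is running, so it does not sit idle for a network round-trip, while only two jobs are lost if that slave fails.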

You can turn failure detection on by providing a <failureDetection> element with the following options:

• Time-out and scan interval affect when and how failures are detected. Periodically (at the scan interval), the master checks for timed-out jobs. A job has timed out if it was sent longer ago than the specified time-out value.
• The maximum retries is the number of times a particular job is retried before it is considered broken and null is put in the result list. A good value depends on your requirements for liveness, the chance of a job always throwing an exception, and the impact of having a null value in the result set.

Note that a job that has to queue takes longer to complete. Hence, make sure that the time-out value takes into account the time you expect the job to take, the target queue size, and whether the slave machines are busy. Setting the time-out too low will cause thrashing, by spamming the kernels with duplicate work messages!
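The interplay of time-out, scan and maximum retries can be sketched as a small bookkeeping class driven by an explicit clock. This is an illustration under stated assumptions, not the framework's implementation: the class FailureDetectionSketch is hypothetical, the time unit is whatever unit the caller uses consistently, and the exact way retries are counted (a job is re-sent up to maximumRetries times, then marked broken on the next time-out) is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not part of SuanShu.
public class FailureDetectionSketch {
    private final long timeout;
    private final int maximumRetries;
    private final Map<Integer, Long> sentAt = new HashMap<>();     // job id -> time of last send
    private final Map<Integer, Integer> retries = new HashMap<>(); // job id -> retries used so far
    private final Set<Integer> broken = new HashSet<>();           // jobs whose result becomes null

    FailureDetectionSketch(long timeout, int maximumRetries) {
        this.timeout = timeout;
        this.maximumRetries = maximumRetries;
    }

    void send(int jobId, long now) {
        sentAt.put(jobId, now);
        retries.putIfAbsent(jobId, 0);
    }

    void onResult(int jobId) {
        sentAt.remove(jobId);
        retries.remove(jobId);
    }

    /** One periodic scan: re-sends timed-out jobs or marks them broken; returns the re-sent ids. */
    List<Integer> scan(long now) {
        List<Integer> resent = new ArrayList<>();
        for (Integer jobId : new ArrayList<>(sentAt.keySet())) {
            if (now - sentAt.get(jobId) <= timeout) {
                continue; // not timed out yet
            }
            int used = retries.get(jobId);
            if (used < maximumRetries) {
                retries.put(jobId, used + 1);
                sentAt.put(jobId, now); // the time-out restarts with the re-send
                resent.add(jobId);
            } else {
                sentAt.remove(jobId);
                retries.remove(jobId);
                broken.add(jobId); // the caller would put null in the result list
            }
        }
        return resent;
    }

    boolean isBroken(int jobId) {
        return broken.contains(jobId);
    }
}
```

In this model a time-out shorter than the time a job spends queued plus its run time makes every scan re-send healthy jobs, which is the thrashing described in the note above.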

### Random Number Generation

In the optional <rng> section, you may specify the seed(s) used by the DynamicCreator algorithm, which generates unique MersenneTwister random number generators, and the seeds for the random number generators that are created. If this section is not provided, the seeds will be chosen randomly on each execution.

The framework can be switched to a mode in which it produces strictly repeatable results, even when using RandomizedFunctions. This mode, however, defeats failure detection, because tasks cannot be re-sent. Please also remember to provide seeds in the section above. For best performance, this option should remain switched off!
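The reason fixed seeds give repeatable results can be shown with one generator per task. The sketch below uses java.util.Random as a stand-in; it does not use DynamicCreator or MersenneTwister, and the class RepeatableRngSketch is hypothetical.

```java
import java.util.Random;

// Hypothetical illustration; java.util.Random stands in for the library's generators.
public class RepeatableRngSketch {

    /** Runs one simulated task per seed and returns the mean of each task's draws. */
    static double[] runTasks(long[] seeds, int drawsPerTask) {
        double[] means = new double[seeds.length];
        for (int task = 0; task < seeds.length; task++) {
            Random rng = new Random(seeds[task]); // each task owns its own seeded generator
            double sum = 0.0;
            for (int i = 0; i < drawsPerTask; i++) {
                sum += rng.nextDouble();
            }
            means[task] = sum / drawsPerTask;
        }
        return means;
    }
}
```

Because each task's draws depend only on its own seed, two runs with the same seeds return identical arrays regardless of which machine or thread executes a given task.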

### Field Summary

| Modifier and Type | Field | Description |
| --- | --- | --- |
| static String | DEFAULT_CONFIG_PATH | The path (relative to the classpath) where the framework looks for the configuration by default. |
| static String | FALLBACK_CONFIG_PATH | When the default configuration file cannot be found, this fallback configuration file will be used. |

### Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| static int | getDefaultConcurrency() | Gets the default concurrency (number of threads) for the local machine, that is, the number of available processors returned by Runtime.getRuntime().availableProcessors(). |
| static GridExecutorFactory | getInstance() | Gets an instance using the configuration found in grid.xml on the classpath. |
| static void | setConfigurationPath(String path) | Sets the configuration path to the given path relative to the classpath. |

### Field Detail

#### DEFAULT_CONFIG_PATH

public static final String DEFAULT_CONFIG_PATH

The path (relative to the classpath) where the framework looks for the configuration by default.

#### FALLBACK_CONFIG_PATH

public static final String FALLBACK_CONFIG_PATH

When the default configuration file cannot be found, this fallback configuration file will be used.
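
The default-then-fallback lookup described by these two fields can be sketched as follows. This is only an illustration of the lookup order; the class ConfigPathSketch, its methods, and the file name grid-fallback.xml are hypothetical, and the actual values of the two constants are not stated here.

```java
import java.util.function.Predicate;

// Hypothetical illustration; not part of SuanShu.
public class ConfigPathSketch {

    /** Returns the default path if it exists, else the fallback path if it exists, else null. */
    static String choosePath(String defaultPath, String fallbackPath, Predicate<String> exists) {
        if (exists.test(defaultPath)) {
            return defaultPath;
        }
        if (exists.test(fallbackPath)) {
            return fallbackPath;
        }
        return null;
    }

    /** Checks whether a resource with the given path is visible on the classpath. */
    static boolean onClasspath(String path) {
        return ClassLoader.getSystemResource(path) != null;
    }

    public static void main(String[] args) {
        // "grid-fallback.xml" is a made-up name used only for this example.
        String chosen = choosePath("grid.xml", "grid-fallback.xml", ConfigPathSketch::onClasspath);
        System.out.println("configuration resource: " + chosen);
    }
}
```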
### Method Detail

#### setConfigurationPath

public static void setConfigurationPath(String path)

Sets the configuration path to the given path relative to the classpath.

For most purposes, you should just place your configuration in the default location ('grid.xml' on the classpath). If you need to use different configurations for different parts of your program, consider using GridExecutorFactoryFromConfig directly.

Parameters:
path - the location of the configuration file

#### getDefaultConcurrency

public static int getDefaultConcurrency()

Gets the default concurrency (number of threads) for the local machine, that is, the number of available processors returned by Runtime.getRuntime().availableProcessors().

Returns:
the default concurrency