Distributed Processing Basics
What is Distributed Processing?
A distributed processing system consists of a cluster of processing 'Nodes' that communicate with each other and coordinate computing tasks. In SURE, Distributed Processing (DP) uses a hierarchical structure in which one 'Master' node communicates with 'Processing' nodes on different machines within the cluster and assigns them tasks. In this way, SURE Distributed Processing divides large projects into smaller sub-projects and distributes them to the available processing machines.
DP can also be used on a single machine that hosts both the Master node and a Processing node (referred to as Local DP).
Benefits of Distributed Processing
Process large projects in Local DP (single-machine)
Easily scale up production by adding Processing nodes on demand
Inspect initial results and perform quality assessment within 1 or 2 days
Minimize progress loss in case of machine restart, power outage, or network disruption
Reduce required disk space
SURE Distributed Processing Workflow
First
The Master:
Performs the Analysis step
Divides the project into smaller sub-projects
Assigns and sends the sub-projects to the Processing nodes
Then
The Processing nodes:
Receive a sub-project from the Master
Process the sub-project and return the results to the Master
Wait to receive the next sub-project
Lastly
Once all sub-projects are completed:
The Master merges the sub-projects and performs the final steps (depending on the results requested)
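The three phases above can be illustrated with a minimal Python sketch. This is not SURE code and does not use any SURE API: the function names (analyze, split_project, process_sub_project, run_master) are hypothetical, the "project" is just a list of numbers, and worker threads on one machine stand in for Processing nodes, which is closest to the Local DP case.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze(project):
    """Hypothetical analysis step: here it only counts the work items."""
    return {"item_count": len(project)}


def split_project(project, chunk_size):
    """Divide the project into sub-projects of at most chunk_size items."""
    return [project[i:i + chunk_size] for i in range(0, len(project), chunk_size)]


def process_sub_project(sub_project):
    """Stand-in for the work a Processing node does on one sub-project."""
    return [item * item for item in sub_project]


def run_master(project, chunk_size=3, node_count=2):
    """Play the Master role: analyze, split, distribute, then merge."""
    # First: the Master analyzes the project and divides it.
    analysis = analyze(project)
    sub_projects = split_project(project, chunk_size)

    # Then: each worker thread acts as a Processing node. It takes a
    # sub-project, processes it, returns the result, and picks up the next.
    with ThreadPoolExecutor(max_workers=node_count) as nodes:
        results = list(nodes.map(process_sub_project, sub_projects))

    # Lastly: the Master merges the sub-project results in their original order.
    merged = [value for part in results for value in part]
    return analysis, merged


if __name__ == "__main__":
    analysis, merged = run_master(list(range(1, 8)))
    print(analysis)
    print(merged)
```

Because each sub-project is processed independently, adding nodes (a larger node_count in this sketch) increases throughput without changing the merge step, and a failed sub-project can be retried without redoing the others.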