Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

The OptimizeSCG transformation deletes regions from the SCG, which only consist of an enty and exit node. After the code generation, these regions are transformed to empty threads, which only consist of a label and are therefore rejected by a later compilation of the resulting SCL_P/C program. However, for real world examples, it is unlikely, that the user models a region without any further functionalty, therefore this step might be removed.

 

Before optimization:After optimization:
Image AddedImage Added

This figure shows that thread C has been removed. If only one thread is left, then the fork and join nodes are removed. If no thread is left, then the parent node from the fork node is connected to the child node of the join node.

NodePriorities:

This transformation step uses the results from the dependency analysis. It checks, whether a valid schedule for the SCG exists and calculates the node priorities afterwards. Therefore, the strongly connected components of the SCG are calculated, where the nodes of the SCG are the nodes of directed the graph which is connected by dependency and transition edges. Pause edges are ignored. If such a strongly connected component contains a dependency edge, the SCG is not schedulable. Otherwise, the SCG is schedulale and the node priorities, which are crucial for the schedule, can be determined. The priority of a node is the longest path originating from that node, where strongly connected components are considered as a single node and transition edges have weight 0  and dependency edges have weight 1. Again, pause edges are ignored. The theoretical foundation of this transformation step can be found here.

Image Added

The image shows the resulting node priorities for ABO (blue).

OptNodePriorities:

This transformation step is optional. It reduces the number of context switches. Surface nodes usually have the priority 0. Additionally, exit nodes usually have the same priority as the exit nodes of their sibling threads. The resulting node priorities of the ABO example shown in the previous transformation step illustrates this. However it is often unnecessary to perform a context switch, just because an exit or surface node appears, especially because in this case, the order in which the exit/surface nodes are scheduled does not matter. Therefore the exit and surface nodes are taken as starting point. The algorithm moves upward from those nodes and checks, whether any parent node with a higher priority exists. If more than one parent exists, the minimal priority is taken. The search terminates immediatly, if an entry, join or depth node is found and additionally, if a node has an incoming dependency. This is important, because otherwise, the order of the threads might be corrupted. Another restriction is, that the node, which is forked first, cannot perform a join, so if the node which is forked first is the node with the lowest exit priority after this optmization, the exit priority of another thread has to be reduced.

Image Added

ABO with optimized node priorities.

ThreadSegmentIDs:

SCL_P requires unique priorities for the schedule. In case, that nodes of different threads have the same priority, the thread segment IDs decide, which thread comes first. As described above, a thread, which forks other threads hands it's own prioID, and therefore it's thread segment ID over to the first child and inherits the prioID and therefore the thread segment ID from the child, which performs the join. The assignment of the thread segment IDs is done by a modified depth-first search, which stops at each join node, until its predecessors have received their thread segment ID. Because a prioID should be only lowered, the assignment algorithm starts with the highest thread segment ID, which is calculated by (number of entry nodes) - (number of forknodes). Although the thread segment ids are shown within the region, additionally the unoptimized prioIDs are shown within the graph. This is because a thread segment ID changes after a join and it should help the user to understand, which node belongs to which thread segment ID.

Image Added

ABO with thread segment IDs (black numbers in the region) and unoptimized prioIDs (red numbers at the nodes)

OptPrioIDs:

The unique prioIDs are calculated by the following formula:

...

The drawback of this formula is, that many prioIDs remain unused. This transformation compresses the prioIDs in use. If any prioID is unused, the thread with the next higher prioID gets that prioID. This makes the bookkeeping for SCL_P more compact.

Image Added

ABO with optimized prioIDs.

SCL_P:

The translation from the SCG with prioIDs to SCL_P is straightforward. Assignments and Conditionals are translated to their corresponding C-Code. Labels and gotos are used, if such a node is visited twice. Surface and their corresponding depth nodes are translated to pause statements. If the assigned prioID changes from one node to another, a prio(p) statement added. However, join, fork and entry nodes need more attention. For each fork node it is important to ensure, that the node, which has the highest prioID is translated first and the node which performs the join, which is the node with the lowest prioID assigned to its exit node is translated last. Additionally, for each forkn with n < 1 a corresponding macro has to be generated. Likewise, a macro needs to be generated, if joinn joins more than one prioID. If another thread has a higher prioID than the exit node of the joining thread, this prioID will be scheduled first and therefore, join does not have to wait for that prioID to finish. However it might happen, that a thread stops because of a pause statement, then the prioID indicated by the corresponding depth node has to be considered by the join. If the prioID of a parallel thread is lower than the prioID of the joining node, it has to be considered as well.

 

 

Packages belonging to this Project:

...