1. Investigating why autopar fails after the Graphite pass
I tested with the following flags:
set args -O2 -fgraphite -ftree-parallelize-loops=4 -fdump-tree-parloops-details -fdump-tree-final_cleanup ../../gcc/testsuite/gcc.dg/autopar/parallelization-1.c
The autopar part fails in the function parallelize_loops:

  FOR_EACH_LOOP (li, loop, 0)
    {
      htab_empty (reduction_list);
      if ((/* Do not bother with loops in cold areas. */
           optimize_loop_nest_for_size_p (loop)
           /* Or loops that roll too little. */
           || expected_loop_iterations (loop) <= n_threads
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The check expected_loop_iterations (loop) <= n_threads is the one that rejects the loop. I think this might be caused by incorrect edge->count and edge->frequency values when create_empty_loop_on_edge is called in translate_clast:

  /* TODO: Fix frequencies and counts. */
  freq = EDGE_FREQUENCY (entry_edge);
  cnt = entry_edge->count;

The optimize_loop_nest_for_size_p (loop) check also rejected the loop in some test cases; this might be caused by Graphite not updating loop->header->frequency correctly. So in the patch that splits autopar more cleanly, we simply bypass these checks. They should be fixed later.
2. Prepare the patch for splitting autopar
In a previous patch, I simply marked all innermost loops as parallel (by introducing a bool flag can_be_parallel in the loop structure). In this patch, we simply bypass the failing checks when this flag is set, and split autopar more clearly into two parts:
1. Data dependency checking
2. Code generation
The loop in parallelize_loops now looks something like:

  FOR_EACH_LOOP (li, loop, 0)
    {
      htab_empty (reduction_list);
      if (/* Do not bother with loops in cold areas. */
          optimize_loop_nest_for_size_p (loop)
          /* And of course, the loop must be parallelizable. */
          || !can_duplicate_loop_p (loop)
          || loop_has_blocks_with_irreducible_flag (loop)
          /* FIXME: the check for vector phi nodes could be removed. */
          || loop_has_vector_phi_nodes (loop))
        continue;

      /* FIXME: Bypass this check as graphite doesn't update the
         count and frequency correctly now */
      if (!loop->can_be_parallel
          && (expected_loop_iterations (loop) <= n_threads
              /* Do not bother with loops in cold areas. */
              || optimize_loop_nest_for_size_p (loop)))
        continue;

      if (!try_get_loop_niter (loop, &niter_desc))
        continue;

      if (!try_create_reduction_list (loop, reduction_list))
        continue;

      if (!loop->can_be_parallel && !loop_parallel_p (loop))
        continue;

      changed = true;
      gen_parallel_loop (loop, reduction_list, n_threads, &niter_desc);
    }

3. Plan
- Run regression tests for this patch on trunk
- Write test cases for the code generation part to make sure it works correctly after Graphite