author | jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4> | 2008-06-06 13:01:54 +0000 |
---|---|---|
committer | jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4> | 2008-06-06 13:01:54 +0000 |
commit | fd6481cf2e4413bca3ef43b1e504e1c78de6025d (patch) | |
tree | 5d5537ea17855b77cca7b9c90a262e584c441592 /libgomp | |
parent | cbdcfa59ffeb7d51f7cbdfe64e1a99e43c82b2ac (diff) | |
* c-cppbuiltin.c (c_cpp_builtins): Change _OPENMP value to
200805.
* langhooks.h (struct lang_hooks_for_decls): Add omp_finish_clause.
Add omp_private_outer_ref hook, add another argument to
omp_clause_default_ctor hook.
* langhooks-def.h (LANG_HOOKS_OMP_FINISH_CLAUSE): Define.
(LANG_HOOKS_OMP_PRIVATE_OUTER_REF): Define.
(LANG_HOOKS_OMP_CLAUSE_DEFAULT_CTOR): Change to
hook_tree_tree_tree_tree_null.
(LANG_HOOKS_DECLS): Add LANG_HOOKS_OMP_FINISH_CLAUSE and
LANG_HOOKS_OMP_PRIVATE_OUTER_REF.
* hooks.c (hook_tree_tree_tree_tree_null): New function.
* hooks.h (hook_tree_tree_tree_tree_null): New prototype.
* tree.def (OMP_TASK): New tree code.
* tree.h (OMP_TASK_COPYFN, OMP_TASK_ARG_SIZE, OMP_TASK_ARG_ALIGN,
OMP_CLAUSE_PRIVATE_OUTER_REF, OMP_CLAUSE_LASTPRIVATE_STMT,
OMP_CLAUSE_COLLAPSE_ITERVAR, OMP_CLAUSE_COLLAPSE_COUNT,
OMP_TASKREG_CHECK, OMP_TASKREG_BODY, OMP_TASKREG_CLAUSES,
OMP_TASKREG_FN, OMP_TASKREG_DATA_ARG, OMP_TASK_BODY,
OMP_TASK_CLAUSES, OMP_TASK_FN, OMP_TASK_DATA_ARG,
OMP_CLAUSE_COLLAPSE_EXPR): Define.
(enum omp_clause_default_kind): Add OMP_CLAUSE_DEFAULT_FIRSTPRIVATE.
(OMP_DIRECTIVE_P): Add OMP_TASK.
(OMP_CLAUSE_COLLAPSE, OMP_CLAUSE_UNTIED): New clause codes.
(OMP_CLAUSE_SCHEDULE_AUTO): New schedule kind.
* tree.c (omp_clause_code_name): Add OMP_CLAUSE_COLLAPSE
and OMP_CLAUSE_UNTIED entries.
(omp_clause_num_ops): Likewise. Increase OMP_CLAUSE_LASTPRIVATE
num_ops to 2.
(walk_tree_1): Handle OMP_CLAUSE_COLLAPSE and OMP_CLAUSE_UNTIED.
Walk OMP_CLAUSE_LASTPRIVATE_STMT.
* tree-pretty-print.c (dump_omp_clause): Handle
OMP_CLAUSE_SCHEDULE_AUTO, OMP_CLAUSE_UNTIED, OMP_CLAUSE_COLLAPSE,
OMP_CLAUSE_DEFAULT_FIRSTPRIVATE.
(dump_generic_node): Handle OMP_TASK and collapsed OMP_FOR loops.
* c-omp.c (c_finish_omp_for): Allow pointer iterators. Remove
warning about unsigned iterators. Change decl/init/cond/incr
arguments to TREE_VECs, check arguments for all collapsed loops.
(c_finish_omp_taskwait): New function.
(c_split_parallel_clauses): Put OMP_CLAUSE_COLLAPSE clause to
ws_clauses.
* c-parser.c (c_parser_omp_for_loop): Parse collapsed loops. Call
default_function_array_conversion on init. Add par_clauses argument.
If decl is present in parallel's lastprivate clause, change it to
shared and add lastprivate clause for decl to OMP_FOR_CLAUSES.
Add clauses argument, on success set OMP_FOR_CLAUSES to it. Look up
collapse count in clauses.
(c_parser_omp_for, c_parser_omp_parallel): Adjust
c_parser_omp_for_loop callers.
(OMP_FOR_CLAUSE_MASK): Add 1 << PRAGMA_OMP_CLAUSE_COLLAPSE.
(c_parser_pragma): Handle PRAGMA_OMP_TASKWAIT.
(c_parser_omp_clause_name): Handle collapse and untied clauses.
(c_parser_omp_clause_collapse, c_parser_omp_clause_untied): New
functions.
(c_parser_omp_clause_schedule): Handle schedule(auto).
Include correct location in the error message.
(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_COLLAPSE
and PRAGMA_OMP_CLAUSE_UNTIED.
(OMP_TASK_CLAUSE_MASK): Define.
(c_parser_omp_task, c_parser_omp_taskwait): New functions.
(c_parser_omp_construct): Handle PRAGMA_OMP_TASK.
* tree-nested.c (convert_nonlocal_omp_clauses,
convert_local_omp_clauses): Handle OMP_CLAUSE_LASTPRIVATE_STMT,
OMP_CLAUSE_REDUCTION_INIT, OMP_CLAUSE_REDUCTION_MERGE,
OMP_CLAUSE_COLLAPSE and OMP_CLAUSE_UNTIED.
Don't handle TREE_STATIC or DECL_EXTERNAL VAR_DECLs in
OMP_CLAUSE_DECL.
(convert_nonlocal_reference, convert_local_reference,
convert_call_expr): Handle OMP_TASK the same as OMP_PARALLEL. Use
OMP_TASKREG_* macros rather than OMP_PARALLEL_*.
(walk_omp_for): Adjust for OMP_FOR_{INIT,COND,INCR} changes.
* tree-gimple.c (is_gimple_stmt): Handle OMP_TASK.
* c-tree.h (c_begin_omp_task, c_finish_omp_task): New prototypes.
* c-pragma.h (PRAGMA_OMP_TASK, PRAGMA_OMP_TASKWAIT): New.
(PRAGMA_OMP_CLAUSE_COLLAPSE, PRAGMA_OMP_CLAUSE_UNTIED): New.
* c-typeck.c (c_begin_omp_task, c_finish_omp_task): New functions.
(c_finish_omp_clauses): Handle OMP_CLAUSE_COLLAPSE and
OMP_CLAUSE_UNTIED.
* c-pragma.c (init_pragma): Init omp task and omp taskwait pragmas.
* c-common.h (c_finish_omp_taskwait): New prototype.
* gimple-low.c (lower_stmt): Handle OMP_TASK.
* tree-parloops.c (create_parallel_loop): Create 1 entry
vectors for OMP_FOR_{INIT,COND,INCR}.
* tree-cfg.c (remove_useless_stmts_1): Handle OMP_* containers.
(make_edges): Handle OMP_TASK.
* tree-ssa-operands.c (get_expr_operands): Handle collapsed OMP_FOR
loops, adjust for OMP_FOR_{INIT,COND,INCR} changes.
* tree-inline.c (estimate_num_insns_1): Handle OMP_TASK.
* builtin-types.def (BT_PTR_ULONGLONG, BT_PTR_FN_VOID_PTR_PTR,
BT_FN_BOOL_ULONGLONGPTR_ULONGLONGPTR,
BT_FN_BOOL_BOOL_ULL_ULL_ULL_ULLPTR_ULLPTR,
BT_FN_BOOL_BOOL_ULL_ULL_ULL_ULL_ULLPTR_ULLPTR,
BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT): New.
* omp-builtins.def (BUILT_IN_GOMP_TASK, BUILT_IN_GOMP_TASKWAIT,
BUILT_IN_GOMP_LOOP_ULL_STATIC_START,
BUILT_IN_GOMP_LOOP_ULL_DYNAMIC_START,
BUILT_IN_GOMP_LOOP_ULL_GUIDED_START,
BUILT_IN_GOMP_LOOP_ULL_RUNTIME_START,
BUILT_IN_GOMP_LOOP_ULL_ORDERED_STATIC_START,
BUILT_IN_GOMP_LOOP_ULL_ORDERED_DYNAMIC_START,
BUILT_IN_GOMP_LOOP_ULL_ORDERED_GUIDED_START,
BUILT_IN_GOMP_LOOP_ULL_ORDERED_RUNTIME_START,
BUILT_IN_GOMP_LOOP_ULL_STATIC_NEXT,
BUILT_IN_GOMP_LOOP_ULL_DYNAMIC_NEXT,
BUILT_IN_GOMP_LOOP_ULL_GUIDED_NEXT,
BUILT_IN_GOMP_LOOP_ULL_RUNTIME_NEXT,
BUILT_IN_GOMP_LOOP_ULL_ORDERED_STATIC_NEXT,
BUILT_IN_GOMP_LOOP_ULL_ORDERED_DYNAMIC_NEXT,
BUILT_IN_GOMP_LOOP_ULL_ORDERED_GUIDED_NEXT,
BUILT_IN_GOMP_LOOP_ULL_ORDERED_RUNTIME_NEXT): New builtins.
* gimplify.c (gimplify_omp_for): Allow pointer type for decl,
handle POINTER_PLUS_EXPR. If loop counter has been replaced and
original iterator is present in lastprivate clause or if
collapse > 1, set OMP_CLAUSE_LASTPRIVATE_STMT. Handle collapsed
OMP_FOR loops, adjust for OMP_FOR_{INIT,COND,INCR} changes.
(gimplify_expr): Handle OMP_SECTIONS_SWITCH and OMP_TASK.
(enum gimplify_omp_var_data): Add GOVD_PRIVATE_OUTER_REF.
(omp_notice_variable): Set GOVD_PRIVATE_OUTER_REF if needed,
if it is set, lookup var in outer contexts too. Handle
OMP_CLAUSE_DEFAULT_FIRSTPRIVATE. Handle vars that are supposed
to be implicitly determined firstprivate for task regions.
(gimplify_scan_omp_clauses): Set GOVD_PRIVATE_OUTER_REF if needed,
if it is set, lookup var in outer contexts too. Set
OMP_CLAUSE_PRIVATE_OUTER_REF if GOVD_PRIVATE_OUTER_REF is set.
Handle OMP_CLAUSE_LASTPRIVATE_STMT, OMP_CLAUSE_COLLAPSE and
OMP_CLAUSE_UNTIED. Take region_type as last argument
instead of in_parallel and in_combined_parallel.
(gimplify_omp_parallel, gimplify_omp_for, gimplify_omp_workshare):
Adjust callers.
(gimplify_adjust_omp_clauses_1): Set OMP_CLAUSE_PRIVATE_OUTER_REF if
GOVD_PRIVATE_OUTER_REF is set. Call omp_finish_clause
langhook.
(new_omp_context): Set default_kind to
OMP_CLAUSE_DEFAULT_UNSPECIFIED for OMP_TASK regions.
(omp_region_type): New enum.
(struct gimplify_omp_ctx): Remove is_parallel and is_combined_parallel
fields, add region_type.
(new_omp_context): Take region_type as argument instead of is_parallel
and is_combined_parallel.
(gimple_add_tmp_var, omp_firstprivatize_variable, omp_notice_variable,
omp_is_private, omp_check_private): Adjust ctx->is_parallel and
ctx->is_combined_parallel checks.
(gimplify_omp_task): New function.
(gimplify_adjust_omp_clauses): Handle OMP_CLAUSE_COLLAPSE and
OMP_CLAUSE_UNTIED.
* omp-low.c (extract_omp_for_data): Use schedule(static)
for schedule(auto). Handle pointer and unsigned iterators.
Compute fd->iter_type. Handle POINTER_PLUS_EXPR increments.
Add loops argument. Extract data for collapsed OMP_FOR loops.
(expand_parallel_call): Assert sched_kind isn't auto,
map runtime schedule to index 3.
(struct omp_for_data_loop): New type.
(struct omp_for_data): Remove v, n1, n2, step, cond_code fields.
Add loop, loops, collapse and iter_type fields.
(workshare_safe_to_combine_p): Disallow combined for if
iter_type is unsigned long long. Don't combine collapse > 1 loops
unless all bounds and steps are constant. Adjust extract_omp_for_data
caller.
(expand_omp_for_generic): Handle pointer, unsigned and long long
iterators. Handle collapsed OMP_FOR loops. Adjust
for struct omp_for_data changes. If libgomp function doesn't return
boolean_type_node, add comparison of the return value with 0.
(expand_omp_for_static_nochunk, expand_omp_for_static_chunk): Handle
pointer, unsigned and long long iterators. Adjust for struct
omp_for_data changes.
(expand_omp_for): Assert sched_kind isn't auto, map runtime schedule
to index 3. Use GOMP_loop_ull*{start,next} if iter_type is
unsigned long long. Allocate loops array, pass it to
extract_omp_for_data. For collapse > 1 loops always use
expand_omp_for_generic.
(omp_context): Add sfield_map and srecord_type fields.
(is_task_ctx, lookup_sfield): New functions.
(use_pointer_for_field): Use is_task_ctx helper. Change first
argument's type from const_tree to tree. Clarify comment.
In OMP_TASK disallow copy-in/out sharing.
(build_sender_ref): Call lookup_sfield instead of lookup_field.
(install_var_field): Add mask argument. Populate both record_type
and srecord_type if needed.
(delete_omp_context): Destroy sfield_map, clear DECL_ABSTRACT_ORIGIN
in srecord_type.
(fixup_child_record_type): Also remap FIELD_DECL's DECL_SIZE{,_UNIT}
and DECL_FIELD_OFFSET.
(scan_sharing_clauses): Adjust install_var_field callers. For
firstprivate clauses on explicit tasks allocate the var by value in
record_type unconditionally, rather than by reference.
Handle OMP_CLAUSE_PRIVATE_OUTER_REF. Scan OMP_CLAUSE_LASTPRIVATE_STMT.
Use is_taskreg_ctx instead of is_parallel_ctx.
Handle OMP_CLAUSE_COLLAPSE and OMP_CLAUSE_UNTIED.
(create_omp_child_function_name): Add task_copy argument, use
*_omp_cpyfn* names if it is true.
(create_omp_child_function): Add task_copy argument, if true create
*_omp_cpyfn* helper function.
(scan_omp_parallel): Adjust create_omp_child_function callers.
Rename parallel_nesting_level to taskreg_nesting_level.
(scan_omp_task): New function.
(lower_rec_input_clauses): Don't run constructors for firstprivate
explicit task vars which are initialized by *_omp_cpyfn*.
Pass outer var ref to omp_clause_default_ctor hook if
OMP_CLAUSE_PRIVATE_OUTER_REF or OMP_CLAUSE_LASTPRIVATE.
Replace OMP_CLAUSE_REDUCTION_PLACEHOLDER decls in
OMP_CLAUSE_REDUCTION_INIT.
(lower_send_clauses): Clear DECL_ABSTRACT_ORIGIN if in task to
avoid duplicate setting of fields. Handle
OMP_CLAUSE_PRIVATE_OUTER_REF.
(lower_send_shared_vars): Use srecord_type if non-NULL. Don't
copy-out if TREE_READONLY, only copy-in.
(expand_task_copyfn): New function.
(expand_task_call): New function.
(struct omp_taskcopy_context): New type.
(task_copyfn_copy_decl, task_copyfn_remap_type, create_task_copyfn):
New functions.
(lower_omp_parallel): Rename to...
(lower_omp_taskreg): ... this. Use OMP_TASKREG_* macros where needed.
Call create_task_copyfn if srecord_type is needed. Adjust
sender_decl type.
(task_shared_vars): New variable.
(check_omp_nesting_restrictions): Warn if work-sharing,
barrier, master or ordered region is closely nested inside OMP_TASK.
Add warnings for barrier if closely nested inside of work-sharing,
ordered, or master region.
(scan_omp_1): Call check_omp_nesting_restrictions even for
GOMP_barrier calls. Rename parallel_nesting_level to
taskreg_nesting_level. Handle OMP_TASK.
(lower_lastprivate_clauses): Even if some lastprivate is found on a
work-sharing construct, continue looking for them on parent parallel
construct.
(lower_omp_for_lastprivate): Add lastprivate clauses
to the beginning of dlist rather than end. Adjust for struct
omp_for_data changes.
(lower_omp_for): Add rec input clauses before OMP_FOR_PRE_BODY,
not after it. Handle collapsed OMP_FOR loops, adjust for
OMP_FOR_{INIT,COND,INCR} changes, adjust extract_omp_for_data
caller.
(get_ws_args_for): Adjust extract_omp_for_data caller.
(scan_omp_for): Handle collapsed OMP_FOR
loops, adjust for OMP_FOR_{INIT,COND,INCR} changes.
(lower_omp_single_simple): If libgomp function doesn't return
boolean_type_node, add comparison of the return value with 0.
(diagnose_sb_1, diagnose_sb_2): Handle collapsed OMP_FOR
loops, adjust for OMP_FOR_{INIT,COND,INCR} changes. Handle OMP_TASK.
(parallel_nesting_level): Rename to...
(taskreg_nesting_level): ... this.
(is_taskreg_ctx): New function.
(build_outer_var_ref, omp_copy_decl): Use is_taskreg_ctx instead
of is_parallel_ctx.
(execute_lower_omp): Rename parallel_nesting_level to
taskreg_nesting_level.
(expand_omp_parallel): Rename to...
(expand_omp_taskreg): ... this. Use OMP_TASKREG_* macros where needed.
Call expand_task_call for OMP_TASK regions.
(expand_omp): Adjust caller, handle OMP_TASK.
(lower_omp_1): Adjust lower_omp_taskreg caller, handle OMP_TASK.
* bitmap.c (bitmap_default_obstack_depth): New variable.
(bitmap_obstack_initialize, bitmap_obstack_release): Do nothing
if argument is NULL and bitmap_default_obstack is already initialized.
* ipa-struct-reorg.c (do_reorg_1): Call bitmap_obstack_release
at the end.
* matrix-reorg.c (matrix_reorg): Likewise.
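For orientation, the kind of source the new OMP_TASK and taskwait support accepts can be sketched as follows. This is a minimal illustration, not code from the commit: fib and fib_parallel are hypothetical names, and the example relies on n being implicitly firstprivate inside each task region, as described for omp_notice_variable above.

```c
#include <assert.h>

/* Hypothetical example, not part of the commit: a task-recursive
   Fibonacci.  Each recursive call becomes an explicit task.  */
static long fib(int n)
{
    long a, b;

    assert(n >= 0);
    if (n < 2)
        return n;

    /* n is implicitly firstprivate in the task; a and b must be
       shared so the parent can read the results.  */
    #pragma omp task shared(a) untied
    a = fib(n - 1);
    #pragma omp task shared(b)
    b = fib(n - 2);

    /* Wait for both child tasks before using a and b.  */
    #pragma omp taskwait
    return a + b;
}

/* Hypothetical wrapper: one thread creates the root of the task
   tree while the rest of the team picks up tasks.  */
long fib_parallel(int n)
{
    long result = 0;

    #pragma omp parallel
    {
        #pragma omp single
        result = fib(n);
    }
    return result;
}
```

Compiled without -fopenmp the pragmas are ignored and the functions run serially with the same result, so the sketch can be checked either way.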
cp/
* cp-tree.h (cxx_omp_finish_clause, cxx_omp_create_clause_info,
dependent_omp_for_p, begin_omp_task, finish_omp_task,
finish_omp_taskwait): New prototypes.
(cxx_omp_clause_default_ctor): Add outer argument.
(finish_omp_for): Add new clauses argument.
* cp-gimplify.c (cxx_omp_finish_clause): New function.
(cxx_omp_predetermined_sharing): Moved from semantics.c, rewritten.
(cxx_omp_clause_default_ctor): Add outer argument.
(cp_genericize_r): Walk OMP_CLAUSE_LASTPRIVATE_STMT.
* cp-objcp-common.h (LANG_HOOKS_OMP_FINISH_CLAUSE): Define.
* parser.c (cp_parser_omp_for_loop): Parse collapsed for loops.
Add par_clauses argument. If decl is present in parallel's
lastprivate clause, change that clause to shared and add
a lastprivate clause for decl to OMP_FOR_CLAUSES.
Fix wording of error messages. Adjust finish_omp_for caller.
Add clauses argument. Parse loops with random access iterators.
(cp_parser_omp_clause_collapse, cp_parser_omp_clause_untied): New
functions.
(cp_parser_omp_for, cp_parser_omp_parallel): Adjust
cp_parser_omp_for_loop callers.
(cp_parser_omp_for_cond, cp_parser_omp_for_incr): New helper
functions.
(cp_parser_omp_clause_name): Handle collapse and untied
clauses.
(cp_parser_omp_clause_schedule): Handle auto schedule.
(cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_COLLAPSE
and PRAGMA_OMP_CLAUSE_UNTIED.
(OMP_FOR_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_COLLAPSE.
(OMP_TASK_CLAUSE_MASK): Define.
(cp_parser_omp_task, cp_parser_omp_taskwait): New functions.
(cp_parser_omp_construct): Handle PRAGMA_OMP_TASK.
(cp_parser_pragma): Handle PRAGMA_OMP_TASK and
PRAGMA_OMP_TASKWAIT.
* pt.c (tsubst_omp_clauses): Handle OMP_CLAUSE_COLLAPSE and
OMP_CLAUSE_UNTIED. Handle OMP_CLAUSE_LASTPRIVATE_STMT.
(tsubst_omp_for_iterator): New function.
(dependent_omp_for_p): New function.
(tsubst_expr) <case OMP_FOR>: Use it. Handle collapsed OMP_FOR
loops. Adjust finish_omp_for caller. Handle loops with random
access iterators. Adjust for OMP_FOR_{INIT,COND,INCR} changes.
(tsubst_expr): Handle OMP_TASK.
* semantics.c (cxx_omp_create_clause_info): New function.
(finish_omp_clauses): Call it. Handle OMP_CLAUSE_UNTIED and
OMP_CLAUSE_COLLAPSE.
(cxx_omp_predetermined_sharing): Removed.
* semantics.c (finish_omp_for): Allow pointer iterators. Use
handle_omp_for_class_iterator and dependent_omp_for_p. Handle
collapsed for loops. Adjust c_finish_omp_for caller. Add new
clauses argument. Fix check for type dependent cond or incr.
Set OMP_FOR_CLAUSES to clauses. Use cp_convert instead of
fold_convert to convert incr amount to difference_type. Only
fold if not in template. If decl is mentioned in lastprivate
clause, set OMP_CLAUSE_LASTPRIVATE_STMT. Handle loops with random
access iterators. Adjust for OMP_FOR_{INIT,COND,INCR}
changes.
(finish_omp_threadprivate): Allow static class members of the
current class.
(handle_omp_for_class_iterator, begin_omp_task, finish_omp_task,
finish_omp_taskwait): New functions.
* parser.c (cp_parser_binary_expression): Add prec argument.
(cp_parser_assignment_expression): Adjust caller.
* cp-tree.h (outer_curly_brace_block): New prototype.
* decl.c (outer_curly_brace_block): No longer static.
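The pointer-iterator and schedule(auto) handling listed above can be illustrated with a short loop. The sketch is written in C rather than C++ and is only an assumption about typical usage; sum_ints is a hypothetical helper that does not appear in the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the commit: sums len ints
   using a pointer as the OpenMP loop iteration variable.  */
long sum_ints(const int *data, size_t len)
{
    long total = 0;
    const int *p;

    assert(data != NULL);

    /* The iterator is a pointer and the increment is pointer
       arithmetic; schedule(auto) leaves the choice of schedule
       to the implementation.  */
    #pragma omp parallel for schedule(auto) reduction(+:total)
    for (p = data; p < data + len; p++)
        total += *p;

    return total;
}
```

The loop keeps the canonical form OpenMP requires (a relational test against a loop-invariant bound and a simple increment), which is what lets the compiler compute the iteration count up front.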
fortran/
* scanner.c (skip_free_comments, skip_fixed_comments): Handle tabs.
* parse.c (next_free): Allow tab after !$omp.
(decode_omp_directive): Handle !$omp task, !$omp taskwait
and !$omp end task.
(case_executable): Add ST_OMP_TASKWAIT.
(case_exec_markers): Add ST_OMP_TASK.
(gfc_ascii_statement): Handle ST_OMP_TASK, ST_OMP_END_TASK and
ST_OMP_TASKWAIT.
(parse_omp_structured_block, parse_executable): Handle ST_OMP_TASK.
* gfortran.h (gfc_find_sym_in_expr): New prototype.
(gfc_statement): Add ST_OMP_TASK, ST_OMP_END_TASK and ST_OMP_TASKWAIT.
(gfc_omp_clauses): Add OMP_SCHED_AUTO to sched_kind,
OMP_DEFAULT_FIRSTPRIVATE to default_sharing. Add collapse and
untied fields.
(gfc_exec_op): Add EXEC_OMP_TASK and EXEC_OMP_TASKWAIT.
* f95-lang.c (LANG_HOOKS_OMP_CLAUSE_COPY_CTOR,
LANG_HOOKS_OMP_CLAUSE_ASSIGN_OP, LANG_HOOKS_OMP_CLAUSE_DTOR,
LANG_HOOKS_OMP_PRIVATE_OUTER_REF): Define.
* trans.h (gfc_omp_clause_default_ctor): Add another argument.
(gfc_omp_clause_copy_ctor, gfc_omp_clause_assign_op,
gfc_omp_clause_dtor, gfc_omp_private_outer_ref): New prototypes.
* types.def (BT_ULONGLONG, BT_PTR_ULONGLONG,
BT_FN_BOOL_ULONGLONGPTR_ULONGLONGPTR,
BT_FN_BOOL_BOOL_ULL_ULL_ULL_ULLPTR_ULLPTR,
BT_FN_BOOL_BOOL_ULL_ULL_ULL_ULL_ULLPTR_ULLPTR,
BT_FN_VOID_PTR_PTR, BT_PTR_FN_VOID_PTR_PTR,
BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT): New.
(BT_BOOL): Use integer type with BOOL_TYPE_SIZE rather
than boolean_type_node.
* dump-parse-tree.c (gfc_show_omp_node): Handle EXEC_OMP_TASK,
EXEC_OMP_TASKWAIT, OMP_SCHED_AUTO, OMP_DEFAULT_FIRSTPRIVATE,
untied and collapse clauses.
(gfc_show_code_node): Handle EXEC_OMP_TASK and EXEC_OMP_TASKWAIT.
* trans.c (gfc_trans_code): Handle EXEC_OMP_TASK and
EXEC_OMP_TASKWAIT.
* st.c (gfc_free_statement): Likewise.
* resolve.c (gfc_resolve_blocks, resolve_code): Likewise.
(find_sym_in_expr): Rename to...
(gfc_find_sym_in_expr): ... this. No longer static.
(resolve_allocate_expr, resolve_ordinary_assign): Adjust caller.
* match.h (gfc_match_omp_task, gfc_match_omp_taskwait): New
prototypes.
* openmp.c (resolve_omp_clauses): Allow allocatable arrays in
firstprivate, lastprivate, reduction, copyprivate and copyin
clauses.
(omp_current_do_code): Made static.
(omp_current_do_collapse): New variable.
(gfc_resolve_omp_do_blocks): Compute omp_current_do_collapse,
clear omp_current_do_code and omp_current_do_collapse on return.
(gfc_resolve_do_iterator): Handle collapsed do loops.
(resolve_omp_do): Likewise, diagnose erroneous collapsed do loops.
(OMP_CLAUSE_COLLAPSE, OMP_CLAUSE_UNTIED): Define.
(gfc_match_omp_clauses): Handle default (firstprivate),
schedule (auto), untied and collapse (n) clauses.
(OMP_DO_CLAUSES): Add OMP_CLAUSE_COLLAPSE.
(OMP_TASK_CLAUSES): Define.
(gfc_match_omp_task, gfc_match_omp_taskwait): New functions.
* trans-openmp.c (gfc_omp_private_outer_ref): New function.
(gfc_omp_clause_default_ctor): Add outer argument. For allocatable
arrays allocate them with the bounds of the outer var if outer
var is allocated.
(gfc_omp_clause_copy_ctor, gfc_omp_clause_assign_op,
gfc_omp_clause_dtor): New functions.
(gfc_trans_omp_array_reduction): If decl is allocatable array,
allocate it with outer var's bounds in OMP_CLAUSE_REDUCTION_INIT
and deallocate it in OMP_CLAUSE_REDUCTION_MERGE.
(gfc_omp_predetermined_sharing): Return OMP_CLAUSE_DEFAULT_SHARED
for assumed-size arrays.
(gfc_trans_omp_do): Add par_clauses argument. If dovar is
present in lastprivate clause and do loop isn't simple,
set OMP_CLAUSE_LASTPRIVATE_STMT. If dovar is present in
parallel's lastprivate clause, change it to shared and add
lastprivate clause to OMP_FOR_CLAUSES. Handle collapsed do loops.
(gfc_trans_omp_directive): Adjust gfc_trans_omp_do callers.
(gfc_trans_omp_parallel_do): Likewise. Move collapse clause to
OMP_FOR from OMP_PARALLEL.
(gfc_trans_omp_clauses): Handle OMP_SCHED_AUTO,
OMP_DEFAULT_FIRSTPRIVATE, untied and collapse clauses.
(gfc_trans_omp_task, gfc_trans_omp_taskwait): New functions.
(gfc_trans_omp_directive): Handle EXEC_OMP_TASK and
EXEC_OMP_TASKWAIT.
gcc/testsuite/
* gcc.dg/gomp/collapse-1.c: New test.
* gcc.dg/gomp/nesting-1.c: New test.
* g++.dg/gomp/task-1.C: New test.
* g++.dg/gomp/predetermined-1.C: New test.
* g++.dg/gomp/tls-4.C: New test.
* gfortran.dg/gomp/collapse1.f90: New test.
* gfortran.dg/gomp/sharing-3.f90: New test.
* gcc.dg/gomp/pr27499.c (foo): Remove "is unsigned" dg-warning.
* g++.dg/gomp/pr27499.C (foo): Likewise.
* g++.dg/gomp/for-16.C (foo): Likewise.
* g++.dg/gomp/tls-3.C: Remove dg-error, add S::s definition.
* g++.dg/gomp/pr34607.C: Adjust dg-error location.
* g++.dg/gomp/for-16.C (foo): Add a new dg-error.
* gcc.dg/gomp/appendix-a/a.35.4.c: Add dg-warning.
* gcc.dg/gomp/appendix-a/a.35.6.c: Likewise.
* gfortran.dg/gomp/appendix-a/a.35.4.f90: Likewise.
* gfortran.dg/gomp/appendix-a/a.35.6.f90: Likewise.
* gfortran.dg/gomp/omp_parse1.f90: Remove !$omp tab test.
* gfortran.dg/gomp/appendix-a/a.33.4.f90: Remove dg-error
about allocatable array.
* gfortran.dg/gomp/reduction1.f90: Likewise.
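As a rough picture of what the collapse and unsigned-iterator tests exercise, the following sketch combines both features. It is not the contents of collapse-1.c or pr27499.c; grid_index_sum is a hypothetical function written for illustration only.

```c
#include <assert.h>

/* Hypothetical helper, not taken from the testsuite: sums
   i * cols + j over a rows x cols grid.  */
unsigned long long grid_index_sum(unsigned rows, unsigned cols)
{
    unsigned long long total = 0;
    unsigned i, j;

    /* Keep the sum far away from wrapping.  */
    assert(rows <= 1000 && cols <= 1000);

    /* collapse(2) merges both loops into one iteration space;
       i and j are unsigned, which no longer draws a warning.  */
    #pragma omp parallel for collapse(2) reduction(+:total)
    for (i = 0; i < rows; i++)
        for (j = 0; j < cols; j++)
            total += (unsigned long long) i * cols + j;

    return total;
}
```

The two loops are perfectly nested and the inner bounds do not depend on i, which is what collapse requires.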
libgomp/
* configure.ac (LIBGOMP_GNU_SYMBOL_VERSIONING): New AC_DEFINE.
Substitute also OMP_*LOCK_25*.
* configure: Regenerated.
* config.h.in: Regenerated.
* Makefile.am (libgomp_la_SOURCES): Add loop_ull.c, iter_ull.c,
ptrlock.c and task.c.
* Makefile.in: Regenerated.
* testsuite/Makefile.in: Regenerated.
* task.c: New file.
* loop_ull.c: New file.
* iter_ull.c: New file.
* libgomp.h: Include ptrlock.h.
(enum gomp_task_kind): New type.
(struct gomp_team): Add task_lock, task_queue, task_count,
task_running_count, single_count fields. Add
work_share_list_free_lock ifndef HAVE_SYNC_BUILTINS.
Remove work_share_lock, generation_mask,
oldest_live_gen, num_live_gen and init_work_shares fields, add
work_share_list_alloc, work_share_list_free and work_share_chunk
fields. Change work_shares from pointer to pointers into an array.
Change ordered_release field into gomp_sem_t ** from flexible array
member. Add implicit_task and initial_work_shares fields.
Move close to the end of the struct.
(struct gomp_team_state): Add single_count, last_work_share,
active_level and level fields, remove work_share_generation.
(gomp_barrier_handle_tasks): New prototype.
(gomp_finish_task): New inline function.
(struct gomp_work_share): Move chunk_size, end, incr into
transparent union/struct, add chunk_size_ull, end_ll, incr_ll and
next_ll fields. Reshuffle fields. Add next_alloc,
next_ws, next_free and inline_ordered_team_ids fields, change
ordered_team_ids into pointer from flexible array member.
Add mode field. Put lock and next into a different cache line
from most of the write-once fields.
(gomp_iter_ull_static_next, gomp_iter_ull_dynamic_next_locked,
gomp_iter_ull_guided_next_locked, gomp_iter_ull_dynamic_next,
gomp_iter_ull_guided_next): New prototypes.
(gomp_new_icv): New prototype.
(struct gomp_thread): Add thread_pool and task fields.
(struct gomp_thread_pool): New type.
(gomp_new_team): New prototype.
(gomp_team_start): Change type of last argument.
(gomp_new_work_share): Removed.
(gomp_init_work_share, gomp_fini_work_share): New prototypes.
(gomp_work_share_init_done): New static inline.
(gomp_throttled_spin_count_var, gomp_available_cpus,
gomp_managed_threads): New extern decls.
(gomp_init_task): New prototype.
(gomp_spin_count_var): New extern var decl.
(LIBGOMP_GNU_SYMBOL_VERSIONING): Undef if no visibility
or no alias support, or if not PIC.
(gomp_init_lock_30, gomp_destroy_lock_30, gomp_set_lock_30,
gomp_unset_lock_30, gomp_test_lock_30, gomp_init_nest_lock_30,
gomp_destroy_nest_lock_30, gomp_set_nest_lock_30,
gomp_unset_nest_lock_30, gomp_test_nest_lock_30, gomp_init_lock_25,
gomp_destroy_lock_25, gomp_set_lock_25, gomp_unset_lock_25,
gomp_test_lock_25, gomp_init_nest_lock_25, gomp_destroy_nest_lock_25,
gomp_set_nest_lock_25, gomp_unset_nest_lock_25,
gomp_test_nest_lock_25): New prototypes.
(omp_lock_symver, strong_alias): Define.
(gomp_remaining_threads_count, gomp_remaining_threads_lock): New
decls.
(gomp_end_task): New.
(struct gomp_task_icv, gomp_global_icv): New.
(gomp_thread_limit_var, gomp_max_active_levels_var): New.
(struct gomp_task): New.
(gomp_nthreads_var, gomp_dyn_var, gomp_nest_var,
gomp_run_sched_var, gomp_run_sched_chunk): Remove.
(gomp_icv): New.
(gomp_schedule_type): Reorder enum to match
omp_sched_t.
* team.c (struct gomp_thread_start_data): Add thread_pool and task
fields.
(gomp_thread_start): Add gomp_team_barrier_wait call.
For non-nested case remove clearing of docked thread thr fields.
Use pool fields instead of global gomp_* variables. Use
gomp_barrier_wait_last when needed. Initialize ts.active_level.
Create tasks for each member thread.
(free_team): Only destroy team barrier, task_lock here and free it.
(gomp_free_thread): Free last_team if non-NULL.
(gomp_team_end): Call gomp_team_barrier_wait instead of
gomp_barrier_wait. For nested case call one extra
gomp_barrier_wait. Move here some destruction from free_team.
Call free_team on pool->last_team if any, rather than freeing
current team. Destroy work_share_list_free_lock ifndef
HAVE_SYNC_BUILTINS.
(gomp_new_icv): New function.
(gomp_threads, gomp_threads_size, gomp_threads_used,
gomp_threads_dock): Removed.
(gomp_thread_destructor): New variable.
(gomp_new_thread_pool, gomp_free_pool_helper, gomp_free_thread): New
functions.
(gomp_team_start): Create new pool if current thread doesn't have
one. Use pool fields instead of global gomp_* variables.
Initialize thread_pool field for new threads. Clear single_count.
Change last argument from ws to team, don't create
new team, set ts.work_share to &team->work_shares[0] and clear
ts.last_work_share. Don't clear ts.work_share_generation.
If number of threads changed, adjust atomically gomp_managed_threads.
Use gomp_init_task instead of gomp_new_task,
set thr->task to the corresponding implicit_task array entry.
Create tasks for each member thread. Initialize ts.level.
(initialize_team): Call pthread_key_create on
gomp_thread_destructor.
(team_destructor): New function.
(new_team): Removed.
(gomp_new_team): New function.
(free_team): Free gomp_work_share blocks chained through next_alloc,
instead of freeing work_shares and destroying work_share_lock.
(gomp_team_end): Call gomp_fini_work_share. If number of threads
changed, adjust atomically gomp_managed_threads. Use gomp_end_task.
* barrier.c (GOMP_barrier): Call gomp_team_barrier_wait instead
of gomp_barrier_wait.
* single.c (GOMP_single_copy_start): Call gomp_team_barrier_wait
instead of gomp_barrier_wait. Call gomp_work_share_init_done
if gomp_work_share_start returned true. Don't unlock ws->lock.
(GOMP_single_copy_end): Call gomp_team_barrier_wait instead
of gomp_barrier_wait.
(GOMP_single_start): Rewritten if HAVE_SYNC_BUILTINS. Call
gomp_work_share_init_done if gomp_work_share_start returned true.
Don't unlock ws->lock.
* work.c: Include stddef.h.
(free_work_share): Use work_share_list_free_lock instead
of atomic chaining ifndef HAVE_SYNC_BUILTINS. Add team argument.
Call gomp_fini_work_share and then either free ws if orphaned, or
put it into work_share_list_free list of the current team.
(alloc_work_share, gomp_init_work_share, gomp_fini_work_share): New
functions.
(gomp_work_share_start, gomp_work_share_end,
gomp_work_share_end_nowait): Rewritten.
* omp_lib.f90.in: Change some tabs to spaces to prevent warnings.
(openmp_version): Set to 200805.
(omp_sched_kind, omp_sched_static, omp_sched_dynamic,
omp_sched_guided, omp_sched_auto): New parameters.
(omp_set_schedule, omp_get_schedule, omp_get_thread_limit,
omp_set_max_active_levels, omp_get_max_active_levels,
omp_get_level, omp_get_ancestor_thread_num, omp_get_team_size,
omp_get_active_level): New interfaces.
* omp_lib.h.in (openmp_version): Set to 200805.
(omp_sched_kind, omp_sched_static, omp_sched_dynamic,
omp_sched_guided, omp_sched_auto): New parameters.
(omp_set_schedule, omp_get_schedule, omp_get_thread_limit,
omp_set_max_active_levels, omp_get_max_active_levels,
omp_get_level, omp_get_ancestor_thread_num, omp_get_team_size,
omp_get_active_level): New externals.
* loop.c: Include limits.h.
(GOMP_loop_runtime_next, GOMP_loop_ordered_runtime_next): Handle
GFS_AUTO.
(GOMP_loop_runtime_start, GOMP_loop_ordered_runtime_start):
Likewise. Use gomp_icv.
(gomp_loop_static_start, gomp_loop_dynamic_start): Clear
ts.static_trip here.
(gomp_loop_static_start, gomp_loop_ordered_static_start): Call
gomp_work_share_init_done after gomp_loop_init. Don't unlock ws->lock.
(gomp_loop_dynamic_start, gomp_loop_guided_start): Call
gomp_work_share_init_done after gomp_loop_init. If HAVE_SYNC_BUILTINS,
don't unlock ws->lock, otherwise lock it.
(gomp_loop_ordered_dynamic_start, gomp_loop_ordered_guided_start): Call
gomp_work_share_init_done after gomp_loop_init. Lock ws->lock.
(gomp_parallel_loop_start): Call gomp_new_team instead of
gomp_new_work_share. Call gomp_loop_init on &team->work_shares[0].
Adjust gomp_team_start caller. Pass 0 as second argument to
gomp_resolve_num_threads.
(gomp_loop_init): For GFS_DYNAMIC, multiply ws->chunk_size by incr.
If adding ws->chunk_size nthreads + 1 times after end won't
overflow, set ws->mode to 1.
* libgomp_g.h (GOMP_loop_ull_static_start, GOMP_loop_ull_dynamic_start,
GOMP_loop_ull_guided_start, GOMP_loop_ull_runtime_start,
GOMP_loop_ull_ordered_static_start,
GOMP_loop_ull_ordered_dynamic_start,
GOMP_loop_ull_ordered_guided_start,
GOMP_loop_ull_ordered_runtime_start, GOMP_loop_ull_static_next,
GOMP_loop_ull_dynamic_next, GOMP_loop_ull_guided_next,
GOMP_loop_ull_runtime_next, GOMP_loop_ull_ordered_static_next,
GOMP_loop_ull_ordered_dynamic_next, GOMP_loop_ull_ordered_guided_next,
GOMP_loop_ull_ordered_runtime_next, GOMP_task, GOMP_taskwait): New
prototypes.
* libgomp.map: Export lock routines also @@OMP_2.0.
(GOMP_loop_ordered_dynamic_first,
GOMP_loop_ordered_guided_first, GOMP_loop_ordered_runtime_first,
GOMP_loop_ordered_static_first): Remove.
(GOMP_loop_ull_dynamic_next, GOMP_loop_ull_dynamic_start,
GOMP_loop_ull_guided_next, GOMP_loop_ull_guided_start,
GOMP_loop_ull_ordered_dynamic_next,
GOMP_loop_ull_ordered_dynamic_start,
GOMP_loop_ull_ordered_guided_next,
GOMP_loop_ull_ordered_guided_start,
GOMP_loop_ull_ordered_runtime_next,
GOMP_loop_ull_ordered_runtime_start,
GOMP_loop_ull_ordered_static_next,
GOMP_loop_ull_ordered_static_start,
GOMP_loop_ull_runtime_next, GOMP_loop_ull_runtime_start,
GOMP_loop_ull_static_next, GOMP_loop_ull_static_start,
GOMP_task, GOMP_taskwait): Export @@GOMP_2.0.
(omp_set_schedule, omp_get_schedule,
omp_get_thread_limit, omp_set_max_active_levels,
omp_get_max_active_levels, omp_get_level,
omp_get_ancestor_thread_num, omp_get_team_size, omp_get_active_level,
omp_set_schedule_, omp_set_schedule_8_,
omp_get_schedule_, omp_get_schedule_8_, omp_get_thread_limit_,
omp_set_max_active_levels_, omp_set_max_active_levels_8_,
omp_get_max_active_levels_, omp_get_level_,
omp_get_ancestor_thread_num_, omp_get_ancestor_thread_num_8_,
omp_get_team_size_, omp_get_team_size_8_, omp_get_active_level_):
New exports @@OMP_3.0.
* omp.h.in (omp_sched_t): New type.
(omp_set_schedule, omp_get_schedule, omp_get_thread_limit,
omp_set_max_active_levels, omp_get_max_active_levels,
omp_get_level, omp_get_ancestor_thread_num, omp_get_team_size,
omp_get_active_level): New prototypes.
* env.c (gomp_spin_count_var, gomp_throttled_spin_count_var,
gomp_available_cpus, gomp_managed_threads, gomp_max_active_levels_var,
gomp_thread_limit_var, gomp_remaining_threads_count,
gomp_remaining_threads_lock): New variables.
(parse_spincount): New function.
(initialize_env): Call gomp_init_num_threads unconditionally.
Initialize gomp_available_cpus. Call parse_spincount,
initialize gomp_{,throttled_}spin_count_var
depending on presence and value of OMP_WAIT_POLICY and
GOMP_SPINCOUNT env vars. Handle GOMP_BLOCKTIME env var.
Handle OMP_WAIT_POLICY, OMP_MAX_ACTIVE_LEVELS,
OMP_THREAD_LIMIT, OMP_STACKSIZE env vars. Handle unit specification
for GOMP_STACKSIZE. Initialize gomp_remaining_threads_count and
gomp_remaining_threads_lock if needed. Use gomp_global_icv.
(gomp_nthreads_var, gomp_dyn_var, gomp_nest_var,
gomp_run_sched_var, gomp_run_sched_chunk): Remove.
(gomp_global_icv): New.
(parse_schedule): Use it. Parse "auto".
(omp_set_num_threads): Use gomp_icv.
(omp_set_dynamic, omp_get_dynamic, omp_set_nested, omp_get_nested):
Likewise.
(omp_get_max_threads): Move from parallel.c.
(omp_set_schedule, omp_get_schedule, omp_get_thread_limit,
omp_set_max_active_levels, omp_get_max_active_levels): New functions,
add ialias.
(parse_stacksize, parse_wait_policy): New functions.
* fortran.c: Rewrite lock wrappers, if symbol versioning provide
both wrappers for compatibility and new locks.
(omp_set_schedule, omp_get_schedule,
omp_get_thread_limit, omp_set_max_active_levels,
omp_get_max_active_levels, omp_get_level,
omp_get_ancestor_thread_num, omp_get_team_size,
omp_get_active_level): New ialias_redirect.
(omp_set_schedule_, omp_set_schedule_8_,
omp_get_schedule_, omp_get_schedule_8_, omp_get_thread_limit_,
omp_set_max_active_levels_, omp_set_max_active_levels_8_,
omp_get_max_active_levels_, omp_get_level_,
omp_get_ancestor_thread_num_, omp_get_ancestor_thread_num_8_,
omp_get_team_size_, omp_get_team_size_8_, omp_get_active_level_):
New functions.
* parallel.c: Include limits.h.
(gomp_resolve_num_threads): Add count argument. Rewritten.
(GOMP_parallel_start): Call gomp_new_team and pass that as last
argument to gomp_team_start. Pass 0 as second argument to
gomp_resolve_num_threads.
(GOMP_parallel_end): Decrease gomp_remaining_threads_count
if gomp_thread_limit_var != ULONG_MAX.
(omp_in_parallel): Implement using ts.active_level.
(omp_get_max_threads): Move to env.c.
(omp_get_level, omp_get_ancestor_thread_num,
omp_get_team_size, omp_get_active_level): New functions,
add ialias.
* sections.c (GOMP_sections_start): Call gomp_work_share_init_done
after gomp_sections_init. If HAVE_SYNC_BUILTINS, call
gomp_iter_dynamic_next instead of the _locked variant and don't take
lock around it, otherwise acquire it before calling
gomp_iter_dynamic_next_locked.
(GOMP_sections_next): If HAVE_SYNC_BUILTINS, call
gomp_iter_dynamic_next instead of the _locked variant and don't take
lock around it.
(GOMP_parallel_sections_start): Call gomp_new_team instead of
gomp_new_work_share. Call gomp_sections_init on &team->work_shares[0].
Adjust gomp_team_start caller. Pass count as second argument to
gomp_resolve_num_threads, don't adjust num_threads after the call.
Use gomp_icv.
* iter.c (gomp_iter_dynamic_next_locked): Don't multiply
ws->chunk_size by incr.
(gomp_iter_dynamic_next): Likewise. If ws->mode, use more efficient
code.
* libgomp_f.h.in (omp_lock_25_arg_t, omp_nest_lock_25_arg_t): New
types.
(omp_lock_25_arg, omp_nest_lock_25_arg): New macros.
(omp_check_defines): Check even the compat defines.
* config/linux/ptrlock.c: New file.
* config/linux/ptrlock.h: New file.
* config/linux/wait.h: New file.
* config/posix/ptrlock.c: New file.
* config/posix/ptrlock.h: New file.
* config/linux/bar.h (gomp_team_barrier_wait,
gomp_team_barrier_wait_end, gomp_team_barrier_wake): New prototypes.
(gomp_team_barrier_set_task_pending,
gomp_team_barrier_clear_task_pending,
gomp_team_barrier_set_waiting_for_tasks,
gomp_team_barrier_waiting_for_tasks,
gomp_team_barrier_done): New inlines.
(gomp_barrier_t): Rewritten.
(gomp_barrier_state_t): New typedef.
(gomp_barrier_init, gomp_barrier_reinit, gomp_barrier_destroy,
gomp_barrier_wait_start): Rewritten.
(gomp_barrier_wait_end): Change second argument to
gomp_barrier_state_t.
(gomp_barrier_last_thread, gomp_barrier_wait_last): New static
inlines.
* config/linux/bar.c: Include wait.h instead of libgomp.h and
futex.h.
(gomp_barrier_wait_end): Rewritten.
(gomp_team_barrier_wait, gomp_team_barrier_wait_end,
gomp_team_barrier_wake, gomp_barrier_wait_last): New functions.
* config/posix/bar.h (gomp_barrier_t): Add generation field.
(gomp_barrier_state_t): New typedef.
(gomp_team_barrier_wait,
gomp_team_barrier_wait_end, gomp_team_barrier_wake): New prototypes.
(gomp_barrier_wait_start): Or all but low 2 bits from generation
into the return value. Return gomp_barrier_state_t.
(gomp_team_barrier_set_task_pending,
gomp_team_barrier_clear_task_pending,
gomp_team_barrier_set_waiting_for_tasks,
gomp_team_barrier_waiting_for_tasks,
gomp_team_barrier_done): New inlines.
(gomp_barrier_wait_end): Change second argument to
gomp_barrier_state_t.
(gomp_barrier_last_thread, gomp_barrier_wait_last): New static
inlines.
* config/posix/bar.c (gomp_barrier_init): Clear generation field.
(gomp_barrier_wait_end): Change second argument to
gomp_barrier_state_t.
(gomp_team_barrier_wait, gomp_team_barrier_wait_end,
gomp_team_barrier_wake): New functions.
* config/linux/mutex.c: Include wait.h instead of libgomp.h and
futex.h.
(gomp_futex_wake, gomp_futex_wait): New variables.
(gomp_mutex_lock_slow): Call do_wait instead of futex_wait.
* config/linux/lock.c: Rewrite to make locks task owned,
for backwards compatibility provide the old entrypoints
if symbol versioning. Include wait.h instead of libgomp.h and
futex.h.
(gomp_set_nest_lock_25): Call do_wait instead of futex_wait.
* config/posix95/lock.c: Rewrite to make locks task owned,
for backwards compatibility provide the old entrypoints
if symbol versioning.
* config/posix/lock.c: Rewrite to make locks task owned,
for backwards compatibility provide the old entrypoints
if symbol versioning.
* config/linux/proc.c (gomp_init_num_threads): Use gomp_global_icv.
(get_num_procs, gomp_dynamic_max_threads): Use gomp_icv.
* config/posix/proc.c, config/mingw32/proc.c: Similarly.
* config/linux/powerpc/futex.h (FUTEX_WAIT, FUTEX_WAKE): Remove.
(sys_futex0): Return error code.
(futex_wake, futex_wait): If ENOSYS was returned, clear
FUTEX_PRIVATE_FLAG in gomp_futex_wa{ke,it} and retry.
(cpu_relax, atomic_write_barrier): New static inlines.
* config/linux/alpha/futex.h (FUTEX_WAIT, FUTEX_WAKE): Remove.
(futex_wake, futex_wait): If ENOSYS was returned, clear
FUTEX_PRIVATE_FLAG in gomp_futex_wa{ke,it} and retry.
(cpu_relax, atomic_write_barrier): New static inlines.
* config/linux/x86/futex.h (FUTEX_WAIT, FUTEX_WAKE): Remove.
(sys_futex0): Return error code.
(futex_wake, futex_wait): If ENOSYS was returned, clear
FUTEX_PRIVATE_FLAG in gomp_futex_wa{ke,it} and retry.
(cpu_relax, atomic_write_barrier): New static inlines.
* config/linux/s390/futex.h (FUTEX_WAIT, FUTEX_WAKE): Remove.
(sys_futex0): Return error code.
(futex_wake, futex_wait): If ENOSYS was returned, clear
FUTEX_PRIVATE_FLAG in gomp_futex_wa{ke,it} and retry.
(cpu_relax, atomic_write_barrier): New static inlines.
* config/linux/ia64/futex.h (FUTEX_WAIT, FUTEX_WAKE): Remove.
(sys_futex0): Return error code.
(futex_wake, futex_wait): If ENOSYS was returned, clear
FUTEX_PRIVATE_FLAG in gomp_futex_wa{ke,it} and retry.
(cpu_relax, atomic_write_barrier): New static inlines.
* config/linux/sparc/futex.h (FUTEX_WAIT, FUTEX_WAKE): Remove.
(sys_futex0): Return error code.
(futex_wake, futex_wait): If ENOSYS was returned, clear
FUTEX_PRIVATE_FLAG in gomp_futex_wa{ke,it} and retry.
(cpu_relax, atomic_write_barrier): New static inlines.
* config/linux/sem.c: Include wait.h instead of libgomp.h and
futex.h.
(gomp_sem_wait_slow): Call do_wait instead of futex_wait.
* config/linux/affinity.c: Assume HAVE_SYNC_BUILTINS.
* config/linux/omp-lock.h (omp_lock_25_t, omp_nest_lock_25_t): New
types.
(omp_nest_lock_t): Change owner into void *, add lock field.
* config/posix95/omp-lock.h: Include semaphore.h.
(omp_lock_25_t, omp_nest_lock_25_t): New types.
(omp_lock_t): Use sem_t instead of mutex if semaphores
aren't broken.
(omp_nest_lock_t): Likewise. Change owner to void *.
* config/posix/omp-lock.h: Include semaphore.h.
(omp_lock_25_t, omp_nest_lock_25_t): New types.
(omp_lock_t): Use sem_t instead of mutex if semaphores
aren't broken.
(omp_nest_lock_t): Likewise. Add owner field.
* testsuite/libgomp.c/collapse-1.c: New test.
* testsuite/libgomp.c/collapse-2.c: New test.
* testsuite/libgomp.c/collapse-3.c: New test.
* testsuite/libgomp.c/icv-1.c: New test.
* testsuite/libgomp.c/icv-2.c: New test.
* testsuite/libgomp.c/lib-2.c: New test.
* testsuite/libgomp.c/lock-1.c: New test.
* testsuite/libgomp.c/lock-2.c: New test.
* testsuite/libgomp.c/lock-3.c: New test.
* testsuite/libgomp.c/loop-4.c: New test.
* testsuite/libgomp.c/loop-5.c: New test.
* testsuite/libgomp.c/loop-6.c: New test.
* testsuite/libgomp.c/loop-7.c: New test.
* testsuite/libgomp.c/loop-8.c: New test.
* testsuite/libgomp.c/loop-9.c: New test.
* testsuite/libgomp.c/nested-3.c: New test.
* testsuite/libgomp.c/nestedfn-6.c: New test.
* testsuite/libgomp.c/sort-1.c: New test.
* testsuite/libgomp.c/task-1.c: New test.
* testsuite/libgomp.c/task-2.c: New test.
* testsuite/libgomp.c/task-3.c: New test.
* testsuite/libgomp.c/task-4.c: New test.
* testsuite/libgomp.c++/c++.exp: Add libstdc++-v3 build includes
to C++ testsuite default compiler options.
* testsuite/libgomp.c++/collapse-1.C: New test.
* testsuite/libgomp.c++/collapse-2.C: New test.
* testsuite/libgomp.c++/ctor-10.C: New test.
* testsuite/libgomp.c++/for-1.C: New test.
* testsuite/libgomp.c++/for-2.C: New test.
* testsuite/libgomp.c++/for-3.C: New test.
* testsuite/libgomp.c++/for-4.C: New test.
* testsuite/libgomp.c++/for-5.C: New test.
* testsuite/libgomp.c++/loop-8.C: New test.
* testsuite/libgomp.c++/loop-9.C: New test.
* testsuite/libgomp.c++/loop-10.C: New test.
* testsuite/libgomp.c++/task-1.C: New test.
* testsuite/libgomp.c++/task-2.C: New test.
* testsuite/libgomp.c++/task-3.C: New test.
* testsuite/libgomp.c++/task-4.C: New test.
* testsuite/libgomp.c++/task-5.C: New test.
* testsuite/libgomp.c++/task-6.C: New test.
* testsuite/libgomp.fortran/allocatable1.f90: New test.
* testsuite/libgomp.fortran/allocatable2.f90: New test.
* testsuite/libgomp.fortran/allocatable3.f90: New test.
* testsuite/libgomp.fortran/allocatable4.f90: New test.
* testsuite/libgomp.fortran/collapse1.f90: New test.
* testsuite/libgomp.fortran/collapse2.f90: New test.
* testsuite/libgomp.fortran/collapse3.f90: New test.
* testsuite/libgomp.fortran/collapse4.f90: New test.
* testsuite/libgomp.fortran/lastprivate1.f90: New test.
* testsuite/libgomp.fortran/lastprivate2.f90: New test.
* testsuite/libgomp.fortran/lib4.f90: New test.
* testsuite/libgomp.fortran/lock-1.f90: New test.
* testsuite/libgomp.fortran/lock-2.f90: New test.
* testsuite/libgomp.fortran/nested1.f90: New test.
* testsuite/libgomp.fortran/nestedfn4.f90: New test.
* testsuite/libgomp.fortran/strassen.f90: New test.
* testsuite/libgomp.fortran/tabs1.f90: New test.
* testsuite/libgomp.fortran/tabs2.f: New test.
* testsuite/libgomp.fortran/task1.f90: New test.
* testsuite/libgomp.fortran/task2.f90: New test.
* testsuite/libgomp.fortran/vla4.f90: Add dg-warning.
* testsuite/libgomp.fortran/vla5.f90: Likewise.
* testsuite/libgomp.c/pr26943-2.c: Likewise.
* testsuite/libgomp.c/pr26943-3.c: Likewise.
* testsuite/libgomp.c/pr26943-4.c: Likewise.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@136433 138bc75d-0d04-0410-961f-82ee72b054a4
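The env.c entries above mention OMP_STACKSIZE handling and a unit specification for GOMP_STACKSIZE. As a rough illustration only, the sketch below parses a value of the form size[B|K|M|G], assuming the OpenMP 3.0 convention that a bare number means kilobytes. The name parse_size_with_unit is hypothetical, accepting the suffix case-insensitively is an assumption of this sketch, and none of this is taken from the parse_stacksize function added by this commit.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper, NOT libgomp's parse_stacksize: parse
   "size[B|K|M|G]" into a byte count.  A bare number is treated as
   kilobytes (assumed from the OpenMP 3.0 text for OMP_STACKSIZE).
   Returns false on malformed input or on overflow.  */
static bool
parse_size_with_unit (const char *s, unsigned long *bytes)
{
  char *end;
  unsigned long value;
  unsigned int shift = 10;	/* default unit: kilobytes */

  while (isspace ((unsigned char) *s))
    s++;
  if (!isdigit ((unsigned char) *s))
    return false;		/* rejects empty input and signs */

  errno = 0;
  value = strtoul (s, &end, 10);
  if (errno != 0)
    return false;		/* out of range for unsigned long */

  while (isspace ((unsigned char) *end))
    end++;
  switch (tolower ((unsigned char) *end))
    {
    case 'b': shift = 0; end++; break;
    case 'k': shift = 10; end++; break;
    case 'm': shift = 20; end++; break;
    case 'g': shift = 30; end++; break;
    default: break;		/* no suffix: keep the default unit */
    }
  while (isspace ((unsigned char) *end))
    end++;
  if (*end != '\0')
    return false;		/* trailing garbage such as "12x" */

  if (value > (ULONG_MAX >> shift))
    return false;		/* value << shift would overflow */
  *bytes = value << shift;
  return true;
}
```

With this sketch, "64K" yields 65536 bytes, "512" yields 524288 bytes because of the kilobyte default, and "12x" is rejected. Checking the value against ULONG_MAX >> shift before shifting keeps the multiplication from wrapping silently.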
Diffstat (limited to 'libgomp')
119 files changed, 12406 insertions, 602 deletions
diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index f67375898d1..73a0aa70c0c 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,470 @@
+2008-06-06 Jakub Jelinek <jakub@redhat.com>
+	    Richard Henderson <rth@redhat.com>
+	    Ulrich Drepper <drepper@redhat.com>
+	    Jakob Blomer <jakob.blomer@ira.uka.de>
+
+	* configure.ac (LIBGOMP_GNU_SYMBOL_VERSIONING): New AC_DEFINE.
+	Substitute also OMP_*LOCK_25*.
+	* configure: Regenerated.
+	* config.h.in: Regenerated.
+	* Makefile.am (libgomp_la_SOURCES): Add loop_ull.c, iter_ull.c,
+	ptrlock.c and task.c.
+	* Makefile.in: Regenerated.
+	* testsuite/Makefile.in: Regenerated.
+	* task.c: New file.
+	* loop_ull.c: New file.
+	* iter_ull.c: New file.
+	* libgomp.h: Include ptrlock.h.
+	(enum gomp_task_kind): New type.
+	(struct gomp_team): Add task_lock, task_queue, task_count,
+	task_running_count, single_count fields. Add
+	work_share_list_free_lock ifndef HAVE_SYNC_BUILTINS.
+	Remove work_share_lock, generation_mask,
+	oldest_live_gen, num_live_gen and init_work_shares fields, add
+	work work_share_list_alloc, work_share_list_free and work_share_chunk
+	fields. Change work_shares from pointer to pointers into an array.
+	Change ordered_release field into gomp_sem_t ** from flexible array
+	member. Add implicit_task and initial_work_shares fields.
+	Move close to the end of the struct.
+	(struct gomp_team_state): Add single_count, last_work_share,
+	active_level and level fields, remove work_share_generation.
+	(gomp_barrier_handle_tasks): New prototype.
+	(gomp_finish_task): New inline function.
+	(struct gomp_work_share): Move chunk_size, end, incr into
+	transparent union/struct, add chunk_size_ull, end_ll, incr_ll and
+	next_ll fields. Reshuffle fields. Add next_alloc,
+	next_ws, next_free and inline_ordered_team_ids fields, change
+	ordered_team_ids into pointer from flexible array member.
+	Add mode field. Put lock and next into a different cache line
+	from most of the write-once fields.
+	(gomp_iter_ull_static_next, gomp_iter_ull_dynamic_next_locked,
+	gomp_iter_ull_guided_next_locked, gomp_iter_ull_dynamic_next,
+	gomp_iter_ull_guided_next): New prototypes.
+	(gomp_new_icv): New prototype.
+	(struct gomp_thread): Add thread_pool and task fields.
+	(struct gomp_thread_pool): New type.
+	(gomp_new_team): New prototype.
+	(gomp_team_start): Change type of last argument.
+	(gomp_new_work_share): Removed.
+	(gomp_init_work_share, gomp_fini_work_share): New prototypes.
+	(gomp_work_share_init_done): New static inline.
+	(gomp_throttled_spin_count_var, gomp_available_cpus,
+	gomp_managed_threads): New extern decls.
+	(gomp_init_task): New prototype.
+	(gomp_spin_count_var): New extern var decl.
+	(LIBGOMP_GNU_SYMBOL_VERSIONING): Undef if no visibility
+	or no alias support, or if not PIC.
+	(gomp_init_lock_30, gomp_destroy_lock_30, gomp_set_lock_30,
+	gomp_unset_lock_30, gomp_test_lock_30, gomp_init_nest_lock_30,
+	gomp_destroy_nest_lock_30, gomp_set_nest_lock_30,
+	gomp_unset_nest_lock_30, gomp_test_nest_lock_30, gomp_init_lock_25,
+	gomp_destroy_lock_25, gomp_set_lock_25, gomp_unset_lock_25,
+	gomp_test_lock_25, gomp_init_nest_lock_25, gomp_destroy_nest_lock_25,
+	gomp_set_nest_lock_25, gomp_unset_nest_lock_25,
+	gomp_test_nest_lock_25): New prototypes.
+	(omp_lock_symver, strong_alias): Define.
+	(gomp_remaining_threads_count, gomp_remaining_threads_lock): New
+	decls.
+	(gomp_end_task): New.
+	(struct gomp_task_icv, gomp_global_icv): New.
+	(gomp_thread_limit_var, gomp_max_active_levels_var): New.
+	(struct gomp_task): New.
+	(gomp_nthreads_var, gomp_dyn_var, gomp_nest_var,
+	gomp_run_sched_var, gomp_run_sched_chunk): Remove.
+	(gomp_icv): New.
+	(gomp_schedule_type): Reorder enum to match
+	omp_sched_t.
+	* team.c (struct gomp_thread_start_data): Add thread_pool and task
+	fields.
+	(gomp_thread_start): Add gomp_team_barrier_wait call.
+	For non-nested case remove clearing of docked thread thr fields.
+	Use pool fields instead of global gomp_* variables. Use
+	gomp_barrier_wait_last when needed. Initialize ts.active_level.
+	Create tasks for each member thread.
+	(free_team): Only destroy team barrier, task_lock here and free it.
+	(gomp_free_thread): Free last_team if non-NULL.
+	(gomp_team_end): Call gomp_team_barrier_wait instead of
+	gomp_barrier_wait. For nested case call one extra
+	gomp_barrier_wait. Move here some destruction from free_team.
+	Call free_team on pool->last_team if any, rather than freeing
+	current team. Destroy work_share_list_free_lock ifndef
+	HAVE_SYNC_BUILTINS.
+	(gomp_new_icv): New function.
+	(gomp_threads, gomp_threads_size, gomp_threads_used,
+	gomp_threads_dock): Removed.
+	(gomp_thread_destructor): New variable.
+	(gomp_new_thread_pool, gomp_free_pool_helper, gomp_free_thread): New
+	functions.
+	(gomp_team_start): Create new pool if current thread doesn't have
+	one. Use pool fields instead of global gomp_* variables.
+	Initialize thread_pool field for new threads. Clear single_count.
+	Change last argument from ws to team, don't create
+	new team, set ts.work_share to &team->work_shares[0] and clear
+	ts.last_work_share. Don't clear ts.work_share_generation.
+	If number of threads changed, adjust atomically gomp_managed_threads.
+	Use gomp_init_task instead of gomp_new_task,
+	set thr->task to the corresponding implicit_task array entry.
+	Create tasks for each member thread. Initialize ts.level.
+	(initialize_team): Call pthread_key_create on
+	gomp_thread_destructor.
+	(team_destructor): New function.
+	(new_team): Removed.
+	(gomp_new_team): New function.
+	(free_team): Free gomp_work_share blocks chained through next_alloc,
+	instead of freeing work_shares and destroying work_share_lock.
+	(gomp_team_end): Call gomp_fini_work_share. If number of threads
+	changed, adjust atomically gomp_managed_threads. Use gomp_end_task.
+	* barrier.c (GOMP_barrier): Call gomp_team_barrier_wait instead
+	of gomp_barrier_wait.
+	* single.c (GOMP_single_copy_start): Call gomp_team_barrier_wait
+	instead of gomp_barrier_wait. Call gomp_work_share_init_done
+	if gomp_work_share_start returned true. Don't unlock ws->lock.
+	(GOMP_single_copy_end): Call gomp_team_barrier_wait instead
+	of gomp_barrier_wait.
+	(GOMP_single_start): Rewritten if HAVE_SYNC_BUILTINS. Call
+	gomp_work_share_init_done if gomp_work_share_start returned true.
+	Don't unlock ws->lock.
+	* work.c: Include stddef.h.
+	(free_work_share): Use work_share_list_free_lock instead
+	of atomic chaining ifndef HAVE_SYNC_BUILTINS. Add team argument.
+	Call gomp_fini_work_share and then either free ws if orphaned, or
+	put it into work_share_list_free list of the current team.
+	(alloc_work_share, gomp_init_work_share, gomp_fini_work_share): New
+	functions.
+	(gomp_work_share_start, gomp_work_share_end,
+	gomp_work_share_end_nowait): Rewritten.
+	* omp_lib.f90.in Change some tabs to spaces to prevent warnings.
+	(openmp_version): Set to 200805.
+	(omp_sched_kind, omp_sched_static, omp_sched_dynamic,
+	omp_sched_guided, omp_sched_auto): New parameters.
+	(omp_set_schedule, omp_get_schedule, omp_get_thread_limit,
+	omp_set_max_active_levels, omp_get_max_active_levels,
+	omp_get_level, omp_get_ancestor_thread_num, omp_get_team_size,
+	omp_get_active_level): New interfaces.
+	* omp_lib.h.in (openmp_version): Set to 200805.
+	(omp_sched_kind, omp_sched_static, omp_sched_dynamic,
+	omp_sched_guided, omp_sched_auto): New parameters.
+	(omp_set_schedule, omp_get_schedule, omp_get_thread_limit,
+	omp_set_max_active_levels, omp_get_max_active_levels,
+	omp_get_level, omp_get_ancestor_thread_num, omp_get_team_size,
+	omp_get_active_level): New externals.
+	* loop.c: Include limits.h.
+	(GOMP_loop_runtime_next, GOMP_loop_ordered_runtime_next): Handle
+	GFS_AUTO.
+	(GOMP_loop_runtime_start, GOMP_loop_ordered_runtime_start):
+	Likewise. Use gomp_icv.
+	(gomp_loop_static_start, gomp_loop_dynamic_start): Clear
+	ts.static_trip here.
+	(gomp_loop_static_start, gomp_loop_ordered_static_start): Call
+	gomp_work_share_init_done after gomp_loop_init. Don't unlock ws->lock.
+	(gomp_loop_dynamic_start, gomp_loop_guided_start): Call
+	gomp_work_share_init_done after gomp_loop_init. If HAVE_SYNC_BUILTINS,
+	don't unlock ws->lock, otherwise lock it.
+	(gomp_loop_ordered_dynamic_start, gomp_loop_ordered_guided_start): Call
+	gomp_work_share_init_done after gomp_loop_init. Lock ws->lock.
+	(gomp_parallel_loop_start): Call gomp_new_team instead of
+	gomp_new_work_share. Call gomp_loop_init on &team->work_shares[0].
+	Adjust gomp_team_start caller. Pass 0 as second argument to
+	gomp_resolve_num_threads.
+	(gomp_loop_init): For GFS_DYNAMIC, multiply ws->chunk_size by incr.
+	If adding ws->chunk_size nthreads + 1 times after end won't
+	overflow, set ws->mode to 1.
+	* libgomp_g.h (GOMP_loop_ull_static_start, GOMP_loop_ull_dynamic_start,
+	GOMP_loop_ull_guided_start, GOMP_loop_ull_runtime_start,
+	GOMP_loop_ull_ordered_static_start,
+	GOMP_loop_ull_ordered_dynamic_start,
+	GOMP_loop_ull_ordered_guided_start,
+	GOMP_loop_ull_ordered_runtime_start, GOMP_loop_ull_static_next,
+	GOMP_loop_ull_dynamic_next, GOMP_loop_ull_guided_next,
+	GOMP_loop_ull_runtime_next, GOMP_loop_ull_ordered_static_next,
+	GOMP_loop_ull_ordered_dynamic_next, GOMP_loop_ull_ordered_guided_next,
+	GOMP_loop_ull_ordered_runtime_next, GOMP_task, GOMP_taskwait): New
+	prototypes.
+	* libgomp.map: Export lock routines also @@OMP_2.0.
+	(GOMP_loop_ordered_dynamic_first,
+	GOMP_loop_ordered_guided_first, GOMP_loop_ordered_runtime_first,
+	GOMP_loop_ordered_static_first): Remove.
+	(GOMP_loop_ull_dynamic_next, GOMP_loop_ull_dynamic_start,
+	GOMP_loop_ull_guided_next, GOMP_loop_ull_guided_start,
+	GOMP_loop_ull_ordered_dynamic_next,
+	GOMP_loop_ull_ordered_dynamic_start,
+	GOMP_loop_ull_ordered_guided_next,
+	GOMP_loop_ull_ordered_guided_start,
+	GOMP_loop_ull_ordered_runtime_next,
+	GOMP_loop_ull_ordered_runtime_start,
+	GOMP_loop_ull_ordered_static_next,
+	GOMP_loop_ull_ordered_static_start,
+	GOMP_loop_ull_runtime_next, GOMP_loop_ull_runtime_start,
+	GOMP_loop_ull_static_next, GOMP_loop_ull_static_start,
+	GOMP_task, GOMP_taskwait): Export @@GOMP_2.0.
+	(omp_set_schedule, omp_get_schedule,
+	omp_get_thread_limit, omp_set_max_active_levels,
+	omp_get_max_active_levels, omp_get_level,
+	omp_get_ancestor_thread_num, omp_get_team_size, omp_get_active_level,
+	omp_set_schedule_, omp_set_schedule_8_,
+	omp_get_schedule_, omp_get_schedule_8_, omp_get_thread_limit_,
+	omp_set_max_active_levels_, omp_set_max_active_levels_8_,
+	omp_get_max_active_levels_, omp_get_level_,
+	omp_get_ancestor_thread_num_, omp_get_ancestor_thread_num_8_,
+	omp_get_team_size_, omp_get_team_size_8_, omp_get_active_level_):
+	New exports @@OMP_3.0.
+	* omp.h.in (omp_sched_t): New type.
+	(omp_set_schedule, omp_get_schedule, omp_get_thread_limit,
+	omp_set_max_active_levels, omp_get_max_active_levels,
+	omp_get_level, omp_get_ancestor_thread_num, omp_get_team_size,
+	omp_get_active_level): New prototypes.
+	* env.c (gomp_spin_count_var, gomp_throttled_spin_count_var,
+	gomp_available_cpus, gomp_managed_threads, gomp_max_active_levels_var,
+	gomp_thread_limit_var, gomp_remaining_threads_count,
+	gomp_remaining_threads_lock): New variables.
+	(parse_spincount): New function.
+	(initialize_env): Call gomp_init_num_threads unconditionally.
+	Initialize gomp_available_cpus. Call parse_spincount,
+	initialize gomp_{,throttled_}spin_count_var
+	depending on presence and value of OMP_WAIT_POLICY and
+	GOMP_SPINCOUNT env vars. Handle GOMP_BLOCKTIME env var.
+	Handle OMP_WAIT_POLICY, OMP_MAX_ACTIVE_LEVELS,
+	OMP_THREAD_LIMIT, OMP_STACKSIZE env vars. Handle unit specification
+	for GOMP_STACKSIZE. Initialize gomp_remaining_threads_count and
+	gomp_remaining_threads_lock if needed. Use gomp_global_icv.
+	(gomp_nthreads_var, gomp_dyn_var, gomp_nest_var,
+	gomp_run_sched_var, gomp_run_sched_chunk): Remove.
+	(gomp_global_icv): New.
+	(parse_schedule): Use it. Parse "auto".
+	(omp_set_num_threads): Use gomp_icv.
+	(omp_set_dynamic, omp_get_dynamic, omp_set_nested, omp_get_nested):
+	Likewise.
+	(omp_get_max_threads): Move from parallel.c.
+	(omp_set_schedule, omp_get_schedule, omp_get_thread_limit,
+	omp_set_max_active_levels, omp_get_max_active_levels): New functions,
+	add ialias.
+	(parse_stacksize, parse_wait_policy): New functions.
+	* fortran.c: Rewrite lock wrappers, if symbol versioning provide
+	both wrappers for compatibility and new locks.
+	(omp_set_schedule, omp_get_schedule,
+	omp_get_thread_limit, omp_set_max_active_levels,
+	omp_get_max_active_levels, omp_get_level,
+	omp_get_ancestor_thread_num, omp_get_team_size,
+	omp_get_active_level): New ialias_redirect.
+	(omp_set_schedule_, omp_set_schedule_8_,
+	omp_get_schedule_, omp_get_schedule_8_, omp_get_thread_limit_,
+	omp_set_max_active_levels_, omp_set_max_active_levels_8_,
+	omp_get_max_active_levels_, omp_get_level_,
+	omp_get_ancestor_thread_num_, omp_get_ancestor_thread_num_8_,
+	omp_get_team_size_, omp_get_team_size_8_, omp_get_active_level_):
+	New functions.
+	* parallel.c: Include limits.h.
+	(gomp_resolve_num_threads): Add count argument. Rewritten.
+	(GOMP_parallel_start): Call gomp_new_team and pass that as last
+	argument to gomp_team_start. Pass 0 as second argument to
+	gomp_resolve_num_threads.
+	(GOMP_parallel_end): Decrease gomp_remaining_threads_count
+	if gomp_thread_limit_var != ULONG_MAX.
+	(omp_in_parallel): Implement using ts.active_level.
+	(omp_get_max_threads): Move to env.c.
+	(omp_get_level, omp_get_ancestor_thread_num,
+	omp_get_team_size, omp_get_active_level): New functions,
+	add ialias.
+	* sections.c (GOMP_sections_start): Call gomp_work_share_init_done
+	after gomp_sections_init. If HAVE_SYNC_BUILTINS, call
+	gomp_iter_dynamic_next instead of the _locked variant and don't take
+	lock around it, otherwise acquire it before calling
+	gomp_iter_dynamic_next_locked.
+	(GOMP_sections_next): If HAVE_SYNC_BUILTINS, call
+	gomp_iter_dynamic_next instead of the _locked variant and don't take
+	lock around it.
+	(GOMP_parallel_sections_start): Call gomp_new_team instead of
+	gomp_new_work_share. Call gomp_sections_init on &team->work_shares[0].
+	Adjust gomp_team_start caller. Pass count as second argument to
+	gomp_resolve_num_threads, don't adjust num_threads after the call.
+	Use gomp_icv.
+	* iter.c (gomp_iter_dynamic_next_locked): Don't multiply
+	ws->chunk_size by incr.
+	(gomp_iter_dynamic_next): Likewise. If ws->mode, use more efficient
+	code.
+	* libgomp_f.h.in (omp_lock_25_arg_t, omp_nest_lock_25_arg_t): New
+	types.
+	(omp_lock_25_arg, omp_nest_lock_25_arg): New macros.
+	(omp_check_defines): Check even the compat defines.
+	* config/linux/ptrlock.c: New file.
+	* config/linux/ptrlock.h: New file.
+	* config/linux/wait.h: New file.
+	* config/posix/ptrlock.c: New file.
+	* config/posix/ptrlock.h: New file.
+	* config/linux/bar.h (gomp_team_barrier_wait,
+	gomp_team_barrier_wait_end, gomp_team_barrier_wake): New prototypes.
+	(gomp_team_barrier_set_task_pending,
+	gomp_team_barrier_clear_task_pending,
+	gomp_team_barrier_set_waiting_for_tasks,
+	gomp_team_barrier_waiting_for_tasks,
+	gomp_team_barrier_done): New inlines.
+	(gomp_barrier_t): Rewritten.
+	(gomp_barrier_state_t): New typedef.
+	(gomp_barrier_init, gomp_barrier_reinit, gomp_barrier_destroy,
+	gomp_barrier_wait_start): Rewritten.
+	(gomp_barrier_wait_end): Change second argument to
+	gomp_barrier_state_t.
+	(gomp_barrier_last_thread, gomp_barrier_wait_last): New static
+	inlines.
+	* config/linux/bar.c: Include wait.h instead of libgomp.h and
+	futex.h.
+	(gomp_barrier_wait_end): Rewritten.
+	(gomp_team_barrier_wait, gomp_team_barrier_wait_end,
+	gomp_team_barrier_wake, gomp_barrier_wait_last): New functions.
+	* config/posix/bar.h (gomp_barrier_t): Add generation field.
+	(gomp_barrier_state_t): New typedef.
+	(gomp_team_barrier_wait,
+	gomp_team_barrier_wait_end, gomp_team_barrier_wake): New prototypes.
+	(gomp_barrier_wait_start): Or all but low 2 bits from generation
+	into the return value. Return gomp_barrier_state_t.
+	(gomp_team_barrier_set_task_pending,
+	gomp_team_barrier_clear_task_pending,
+	gomp_team_barrier_set_waiting_for_tasks,
+	gomp_team_barrier_waiting_for_tasks,
+	gomp_team_barrier_done): New inlines.
+	(gomp_barrier_wait_end): Change second argument to
+	gomp_barrier_state_t.
+	(gomp_barrier_last_thread, gomp_barrier_wait_last): New static
+	inlines.
+	* config/posix/bar.c (gomp_barrier_init): Clear generation field.
+	(gomp_barrier_wait_end): Change second argument to
+	gomp_barrier_state_t.
+	(gomp_team_barrier_wait, gomp_team_barrier_wait_end,
+	gomp_team_barrier_wake): New functions.
+	* config/linux/mutex.c: Include wait.h instead of libgomp.h and
+	futex.h.
+	(gomp_futex_wake, gomp_futex_wait): New variables.
+	(gomp_mutex_lock_slow): Call do_wait instead of futex_wait.
+	* config/linux/lock.c: Rewrite to make locks task owned,
+	for backwards compatibility provide the old entrypoints
+	if symbol versioning. Include wait.h instead of libgomp.h and
+	futex.h.
+	(gomp_set_nest_lock_25): Call do_wait instead of futex_wait.
+	* config/posix95/lock.c: Rewrite to make locks task owned,
+	for backwards compatibility provide the old entrypoints
+	if symbol versioning.
+	* config/posix/lock.c: Rewrite to make locks task owned,
+	for backwards compatibility provide the old entrypoints
+	if symbol versioning.
+	* config/linux/proc.c (gomp_init_num_threads): Use gomp_global_icv.
+	(get_num_procs, gomp_dynamic_max_threads): Use gomp_icv.
+	* config/posix/proc.c, config/mingw32/proc.c: Similarly.
+	* config/linux/powerpc/futex.h (FUTEX_WAIT, FUTEX_WAKE): Remove.
+	(sys_futex0): Return error code.
+	(futex_wake, futex_wait): If ENOSYS was returned, clear
+	FUTEX_PRIVATE_FLAG in gomp_futex_wa{ke,it} and retry.
+	(cpu_relax, atomic_write_barrier): New static inlines.
+	* config/linux/alpha/futex.h (FUTEX_WAIT, FUTEX_WAKE): Remove.
+	(futex_wake, futex_wait): If ENOSYS was returned, clear
+	FUTEX_PRIVATE_FLAG in gomp_futex_wa{ke,it} and retry.
+	(cpu_relax, atomic_write_barrier): New static inlines.
+	* config/linux/x86/futex.h (FUTEX_WAIT, FUTEX_WAKE): Remove.
+	(sys_futex0): Return error code.
+	(futex_wake, futex_wait): If ENOSYS was returned, clear
+	FUTEX_PRIVATE_FLAG in gomp_futex_wa{ke,it} and retry.
+	(cpu_relax, atomic_write_barrier): New static inlines.
+	* config/linux/s390/futex.h (FUTEX_WAIT, FUTEX_WAKE): Remove.
+	(sys_futex0): Return error code.
+	(futex_wake, futex_wait): If ENOSYS was returned, clear
+	FUTEX_PRIVATE_FLAG in gomp_futex_wa{ke,it} and retry.
+	(cpu_relax, atomic_write_barrier): New static inlines.
+	* config/linux/ia64/futex.h (FUTEX_WAIT, FUTEX_WAKE): Remove.
+	(sys_futex0): Return error code.
+	(futex_wake, futex_wait): If ENOSYS was returned, clear
+	FUTEX_PRIVATE_FLAG in gomp_futex_wa{ke,it} and retry.
+	(cpu_relax, atomic_write_barrier): New static inlines.
+	* config/linux/sparc/futex.h (FUTEX_WAIT, FUTEX_WAKE): Remove.
+	(sys_futex0): Return error code.
+	(futex_wake, futex_wait): If ENOSYS was returned, clear
+	FUTEX_PRIVATE_FLAG in gomp_futex_wa{ke,it} and retry.
+	(cpu_relax, atomic_write_barrier): New static inlines.
+	* config/linux/sem.c: Include wait.h instead of libgomp.h and
+	futex.h.
+	(gomp_sem_wait_slow): Call do_wait instead of futex_wait.
+	* config/linux/affinity.c: Assume HAVE_SYNC_BUILTINS.
+	* config/linux/omp-lock.h (omp_lock_25_t, omp_nest_lock_25_t): New
+	types.
+	(omp_nest_lock_t): Change owner into void *, add lock field.
+	* config/posix95/omp-lock.h: Include semaphore.h.
+	(omp_lock_25_t, omp_nest_lock_25_t): New types.
+	(omp_lock_t): Use sem_t instead of mutex if semaphores
+	aren't broken.
+	(omp_nest_lock_t): Likewise. Change owner to void *.
+	* config/posix/omp-lock.h: Include semaphore.h.
+	(omp_lock_25_t, omp_nest_lock_25_t): New types.
+	(omp_lock_t): Use sem_t instead of mutex if semaphores
+	aren't broken.
+	(omp_nest_lock_t): Likewise. Add owner field.
+
+2008-06-06 Jakub Jelinek <jakub@redhat.com>
+
+	* testsuite/libgomp.c/collapse-1.c: New test.
+	* testsuite/libgomp.c/collapse-2.c: New test.
+	* testsuite/libgomp.c/collapse-3.c: New test.
+	* testsuite/libgomp.c/icv-1.c: New test.
+	* testsuite/libgomp.c/icv-2.c: New test.
+	* testsuite/libgomp.c/lib-2.c: New test.
+	* testsuite/libgomp.c/lock-1.c: New test.
+	* testsuite/libgomp.c/lock-2.c: New test.
+	* testsuite/libgomp.c/lock-3.c: New test.
+	* testsuite/libgomp.c/loop-4.c: New test.
+	* testsuite/libgomp.c/loop-5.c: New test.
+	* testsuite/libgomp.c/loop-6.c: New test.
+	* testsuite/libgomp.c/loop-7.c: New test.
+	* testsuite/libgomp.c/loop-8.c: New test.
+	* testsuite/libgomp.c/loop-9.c: New test.
+	* testsuite/libgomp.c/nested-3.c: New test.
+	* testsuite/libgomp.c/nestedfn-6.c: New test.
+	* testsuite/libgomp.c/sort-1.c: New test.
+	* testsuite/libgomp.c/task-1.c: New test.
+	* testsuite/libgomp.c/task-2.c: New test.
+	* testsuite/libgomp.c/task-3.c: New test.
+	* testsuite/libgomp.c/task-4.c: New test.
+	* testsuite/libgomp.c++/c++.exp: Add libstdc++-v3 build includes
+	to C++ testsuite default compiler options.
+	* testsuite/libgomp.c++/collapse-1.C: New test.
+	* testsuite/libgomp.c++/collapse-2.C: New test.
+	* testsuite/libgomp.c++/ctor-10.C: New test.
+	* testsuite/libgomp.c++/for-1.C: New test.
+	* testsuite/libgomp.c++/for-2.C: New test.
+ * testsuite/libgomp.c++/for-3.C: New test. + * testsuite/libgomp.c++/for-4.C: New test. + * testsuite/libgomp.c++/for-5.C: New test. + * testsuite/libgomp.c++/loop-8.C: New test. + * testsuite/libgomp.c++/loop-9.C: New test. + * testsuite/libgomp.c++/loop-10.C: New test. + * testsuite/libgomp.c++/task-1.C: New test. + * testsuite/libgomp.c++/task-2.C: New test. + * testsuite/libgomp.c++/task-3.C: New test. + * testsuite/libgomp.c++/task-4.C: New test. + * testsuite/libgomp.c++/task-5.C: New test. + * testsuite/libgomp.c++/task-6.C: New test. + * testsuite/libgomp.fortran/allocatable1.f90: New test. + * testsuite/libgomp.fortran/allocatable2.f90: New test. + * testsuite/libgomp.fortran/allocatable3.f90: New test. + * testsuite/libgomp.fortran/allocatable4.f90: New test. + * testsuite/libgomp.fortran/collapse1.f90: New test. + * testsuite/libgomp.fortran/collapse2.f90: New test. + * testsuite/libgomp.fortran/collapse3.f90: New test. + * testsuite/libgomp.fortran/collapse4.f90: New test. + * testsuite/libgomp.fortran/lastprivate1.f90: New test. + * testsuite/libgomp.fortran/lastprivate2.f90: New test. + * testsuite/libgomp.fortran/lib4.f90: New test. + * testsuite/libgomp.fortran/lock-1.f90: New test. + * testsuite/libgomp.fortran/lock-2.f90: New test. + * testsuite/libgomp.fortran/nested1.f90: New test. + * testsuite/libgomp.fortran/nestedfn4.f90: New test. + * testsuite/libgomp.fortran/strassen.f90: New test. + * testsuite/libgomp.fortran/tabs1.f90: New test. + * testsuite/libgomp.fortran/tabs2.f: New test. + * testsuite/libgomp.fortran/task1.f90: New test. + * testsuite/libgomp.fortran/task2.f90: New test. + * testsuite/libgomp.fortran/vla4.f90: Add dg-warning. + * testsuite/libgomp.fortran/vla5.f90: Likewise. + * testsuite/libgomp.c/pr26943-2.c: Likewise. + * testsuite/libgomp.c/pr26943-3.c: Likewise. + * testsuite/libgomp.c/pr26943-4.c: Likewise. 
+ 2008-05-23 Jakub Jelinek <jakub@redhat.com> PR c++/36308 diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am index 55e3bf3ee15..996802a913a 100644 --- a/libgomp/Makefile.am +++ b/libgomp/Makefile.am @@ -30,8 +30,9 @@ libgomp_version_info = -version-info $(libtool_VERSION) libgomp_la_LDFLAGS = $(libgomp_version_info) $(libgomp_version_script) libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \ - loop.c ordered.c parallel.c sections.c single.c team.c work.c \ - lock.c mutex.c proc.c sem.c bar.c time.c fortran.c affinity.c + iter_ull.c loop.c loop_ull.c ordered.c parallel.c sections.c single.c \ + task.c team.c work.c lock.c mutex.c proc.c sem.c bar.c ptrlock.c \ + time.c fortran.c affinity.c nodist_noinst_HEADERS = libgomp_f.h nodist_libsubinclude_HEADERS = omp.h diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in index 0bdf07721cc..8e0f546c6df 100644 --- a/libgomp/Makefile.in +++ b/libgomp/Makefile.in @@ -83,9 +83,10 @@ toolexeclibLTLIBRARIES_INSTALL = $(INSTALL) LTLIBRARIES = $(toolexeclib_LTLIBRARIES) libgomp_la_LIBADD = am_libgomp_la_OBJECTS = alloc.lo barrier.lo critical.lo env.lo \ - error.lo iter.lo loop.lo ordered.lo parallel.lo sections.lo \ - single.lo team.lo work.lo lock.lo mutex.lo proc.lo sem.lo \ - bar.lo time.lo fortran.lo affinity.lo + error.lo iter.lo iter_ull.lo loop.lo loop_ull.lo ordered.lo \ + parallel.lo sections.lo single.lo task.lo team.lo work.lo \ + lock.lo mutex.lo proc.lo sem.lo bar.lo ptrlock.lo time.lo \ + fortran.lo affinity.lo libgomp_la_OBJECTS = $(am_libgomp_la_OBJECTS) DEFAULT_INCLUDES = -I. -I$(srcdir) -I. 
depcomp = $(SHELL) $(top_srcdir)/../depcomp @@ -193,9 +194,15 @@ MAINTAINER_MODE_TRUE = @MAINTAINER_MODE_TRUE@ MAKEINFO = @MAKEINFO@ NM = @NM@ OBJEXT = @OBJEXT@ +OMP_LOCK_25_ALIGN = @OMP_LOCK_25_ALIGN@ +OMP_LOCK_25_KIND = @OMP_LOCK_25_KIND@ +OMP_LOCK_25_SIZE = @OMP_LOCK_25_SIZE@ OMP_LOCK_ALIGN = @OMP_LOCK_ALIGN@ OMP_LOCK_KIND = @OMP_LOCK_KIND@ OMP_LOCK_SIZE = @OMP_LOCK_SIZE@ +OMP_NEST_LOCK_25_ALIGN = @OMP_NEST_LOCK_25_ALIGN@ +OMP_NEST_LOCK_25_KIND = @OMP_NEST_LOCK_25_KIND@ +OMP_NEST_LOCK_25_SIZE = @OMP_NEST_LOCK_25_SIZE@ OMP_NEST_LOCK_ALIGN = @OMP_NEST_LOCK_ALIGN@ OMP_NEST_LOCK_KIND = @OMP_NEST_LOCK_KIND@ OMP_NEST_LOCK_SIZE = @OMP_NEST_LOCK_SIZE@ @@ -289,8 +296,9 @@ nodist_toolexeclib_HEADERS = libgomp.spec libgomp_version_info = -version-info $(libtool_VERSION) libgomp_la_LDFLAGS = $(libgomp_version_info) $(libgomp_version_script) libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \ - loop.c ordered.c parallel.c sections.c single.c team.c work.c \ - lock.c mutex.c proc.c sem.c bar.c time.c fortran.c affinity.c + iter_ull.c loop.c loop_ull.c ordered.c parallel.c sections.c single.c \ + task.c team.c work.c lock.c mutex.c proc.c sem.c bar.c ptrlock.c \ + time.c fortran.c affinity.c nodist_noinst_HEADERS = libgomp_f.h nodist_libsubinclude_HEADERS = omp.h @@ -426,15 +434,19 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/error.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fortran.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/iter.Plo@am__quote@ +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/iter_ull.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/lock.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/loop.Plo@am__quote@ +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/loop_ull.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/mutex.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ordered.Plo@am__quote@ @AMDEP_TRUE@@am__include@ 
@am__quote@./$(DEPDIR)/parallel.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/proc.Plo@am__quote@ +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ptrlock.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sections.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sem.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/single.Plo@am__quote@ +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/task.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/team.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/time.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/work.Plo@am__quote@ diff --git a/libgomp/barrier.c b/libgomp/barrier.c index bcad683af5e..24037aba17b 100644 --- a/libgomp/barrier.c +++ b/libgomp/barrier.c @@ -40,5 +40,5 @@ GOMP_barrier (void) if (team == NULL) return; - gomp_barrier_wait (&team->barrier); + gomp_team_barrier_wait (&team->barrier); } diff --git a/libgomp/config.h.in b/libgomp/config.h.in index eed8a8837af..88a616ca1ed 100644 --- a/libgomp/config.h.in +++ b/libgomp/config.h.in @@ -69,6 +69,9 @@ /* Define to 1 if you have the <unistd.h> header file. */ #undef HAVE_UNISTD_H +/* Define to 1 if GNU symbol versioning is used for libgomp. */ +#undef LIBGOMP_GNU_SYMBOL_VERSIONING + /* Define to the sub-directory in which libtool stores uninstalled libraries. */ #undef LT_OBJDIR diff --git a/libgomp/config/linux/affinity.c b/libgomp/config/linux/affinity.c index 8fcce5f3a5b..7b6d6c008d7 100644 --- a/libgomp/config/linux/affinity.c +++ b/libgomp/config/linux/affinity.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2006, 2007 Free Software Foundation, Inc. +/* Copyright (C) 2006, 2007, 2008 Free Software Foundation, Inc. Contributed by Jakub Jelinek <jakub@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). 
@@ -38,9 +38,6 @@ #ifdef HAVE_PTHREAD_AFFINITY_NP static unsigned int affinity_counter; -#ifndef HAVE_SYNC_BUILTINS -static gomp_mutex_t affinity_lock; -#endif void gomp_init_affinity (void) @@ -76,9 +73,6 @@ gomp_init_affinity (void) CPU_SET (gomp_cpu_affinity[0], &cpuset); pthread_setaffinity_np (pthread_self (), sizeof (cpuset), &cpuset); affinity_counter = 1; -#ifndef HAVE_SYNC_BUILTINS - gomp_mutex_init (&affinity_lock); -#endif } void @@ -87,13 +81,7 @@ gomp_init_thread_affinity (pthread_attr_t *attr) unsigned int cpu; cpu_set_t cpuset; -#ifdef HAVE_SYNC_BUILTINS cpu = __sync_fetch_and_add (&affinity_counter, 1); -#else - gomp_mutex_lock (&affinity_lock); - cpu = affinity_counter++; - gomp_mutex_unlock (&affinity_lock); -#endif cpu %= gomp_cpu_affinity_len; CPU_ZERO (&cpuset); CPU_SET (gomp_cpu_affinity[cpu], &cpuset); diff --git a/libgomp/config/linux/alpha/futex.h b/libgomp/config/linux/alpha/futex.h index 98681a8526e..4f0bda2424c 100644 --- a/libgomp/config/linux/alpha/futex.h +++ b/libgomp/config/linux/alpha/futex.h @@ -1,4 +1,4 @@ -/* Copyright (C) 2005 Free Software Foundation, Inc. +/* Copyright (C) 2005, 2008 Free Software Foundation, Inc. Contributed by Richard Henderson <rth@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). 
@@ -30,8 +30,6 @@ #ifndef SYS_futex #define SYS_futex 394 #endif -#define FUTEX_WAIT 0 -#define FUTEX_WAKE 1 static inline void @@ -45,7 +43,7 @@ futex_wait (int *addr, int val) sc_0 = SYS_futex; sc_16 = (long) addr; - sc_17 = FUTEX_WAIT; + sc_17 = gomp_futex_wait; sc_18 = val; sc_19 = 0; __asm volatile ("callsys" @@ -53,6 +51,20 @@ futex_wait (int *addr, int val) : "0"(sc_0), "r" (sc_16), "r"(sc_17), "r"(sc_18), "1"(sc_19) : "$1", "$2", "$3", "$4", "$5", "$6", "$7", "$8", "$22", "$23", "$24", "$25", "$27", "$28", "memory"); + if (__builtin_expect (sc_19, 0) && sc_0 == ENOSYS) + { + gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG; + gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG; + sc_0 = SYS_futex; + sc_17 &= ~FUTEX_PRIVATE_FLAG; + sc_19 = 0; + __asm volatile ("callsys" + : "=r" (sc_0), "=r"(sc_19) + : "0"(sc_0), "r" (sc_16), "r"(sc_17), "r"(sc_18), + "1"(sc_19) + : "$1", "$2", "$3", "$4", "$5", "$6", "$7", "$8", + "$22", "$23", "$24", "$25", "$27", "$28", "memory"); + } } static inline void @@ -66,11 +78,35 @@ futex_wake (int *addr, int count) sc_0 = SYS_futex; sc_16 = (long) addr; - sc_17 = FUTEX_WAKE; + sc_17 = gomp_futex_wake; sc_18 = count; __asm volatile ("callsys" : "=r" (sc_0), "=r"(sc_19) : "0"(sc_0), "r" (sc_16), "r"(sc_17), "r"(sc_18) : "$1", "$2", "$3", "$4", "$5", "$6", "$7", "$8", "$22", "$23", "$24", "$25", "$27", "$28", "memory"); + if (__builtin_expect (sc_19, 0) && sc_0 == ENOSYS) + { + gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG; + gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG; + sc_0 = SYS_futex; + sc_17 &= ~FUTEX_PRIVATE_FLAG; + __asm volatile ("callsys" + : "=r" (sc_0), "=r"(sc_19) + : "0"(sc_0), "r" (sc_16), "r"(sc_17), "r"(sc_18) + : "$1", "$2", "$3", "$4", "$5", "$6", "$7", "$8", + "$22", "$23", "$24", "$25", "$27", "$28", "memory"); + } +} + +static inline void +cpu_relax (void) +{ + __asm volatile ("" : : : "memory"); +} + +static inline void +atomic_write_barrier (void) +{ + __asm volatile ("wmb" : : : "memory"); } diff --git a/libgomp/config/linux/bar.c 
b/libgomp/config/linux/bar.c index 5c4f32e6f8b..7af36d2c421 100644 --- a/libgomp/config/linux/bar.c +++ b/libgomp/config/linux/bar.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2005 Free Software Foundation, Inc. +/* Copyright (C) 2005, 2008 Free Software Foundation, Inc. Contributed by Richard Henderson <rth@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). @@ -29,36 +29,97 @@ mechanism for libgomp. This type is private to the library. This implementation uses atomic instructions and the futex syscall. */ -#include "libgomp.h" -#include "futex.h" #include <limits.h> +#include "wait.h" void -gomp_barrier_wait_end (gomp_barrier_t *bar, bool last) +gomp_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state) { - if (last) + if (__builtin_expect ((state & 1) != 0, 0)) { - bar->generation++; - futex_wake (&bar->generation, INT_MAX); + /* Next time we'll be awaiting TOTAL threads again. */ + bar->awaited = bar->total; + atomic_write_barrier (); + bar->generation += 4; + futex_wake ((int *) &bar->generation, INT_MAX); } else { - unsigned int generation = bar->generation; - - gomp_mutex_unlock (&bar->mutex); + unsigned int generation = state; do - futex_wait (&bar->generation, generation); + do_wait ((int *) &bar->generation, generation); while (bar->generation == generation); } +} + +void +gomp_barrier_wait (gomp_barrier_t *bar) +{ + gomp_barrier_wait_end (bar, gomp_barrier_wait_start (bar)); +} - if (__sync_add_and_fetch (&bar->arrived, -1) == 0) - gomp_mutex_unlock (&bar->mutex); +/* Like gomp_barrier_wait, except that if the encountering thread + is not the last one to hit the barrier, it returns immediately. + The intended usage is that a thread which intends to gomp_barrier_destroy + this barrier calls gomp_barrier_wait, while all other threads + call gomp_barrier_wait_last. When gomp_barrier_wait returns, + the barrier can be safely destroyed. 
*/ + +void +gomp_barrier_wait_last (gomp_barrier_t *bar) +{ + gomp_barrier_state_t state = gomp_barrier_wait_start (bar); + if (state & 1) + gomp_barrier_wait_end (bar, state); +} + +void +gomp_team_barrier_wake (gomp_barrier_t *bar, int count) +{ + futex_wake ((int *) &bar->generation, count == 0 ? INT_MAX : count); +} + +void +gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state) +{ + unsigned int generation; + + if (__builtin_expect ((state & 1) != 0, 0)) + { + /* Next time we'll be awaiting TOTAL threads again. */ + struct gomp_thread *thr = gomp_thread (); + struct gomp_team *team = thr->ts.team; + bar->awaited = bar->total; + atomic_write_barrier (); + if (__builtin_expect (team->task_count, 0)) + { + gomp_barrier_handle_tasks (state); + state &= ~1; + } + else + { + bar->generation = state + 3; + futex_wake ((int *) &bar->generation, INT_MAX); + return; + } + } + + generation = state; + do + { + do_wait ((int *) &bar->generation, generation); + if (__builtin_expect (bar->generation & 1, 0)) + gomp_barrier_handle_tasks (state); + if ((bar->generation & 2)) + generation |= 2; + } + while (bar->generation != state + 4); } void -gomp_barrier_wait (gomp_barrier_t *barrier) +gomp_team_barrier_wait (gomp_barrier_t *bar) { - gomp_barrier_wait_end (barrier, gomp_barrier_wait_start (barrier)); + gomp_team_barrier_wait_end (bar, gomp_barrier_wait_start (bar)); } diff --git a/libgomp/config/linux/bar.h b/libgomp/config/linux/bar.h index 57268585d8b..85150caf189 100644 --- a/libgomp/config/linux/bar.h +++ b/libgomp/config/linux/bar.h @@ -1,4 +1,4 @@ -/* Copyright (C) 2005 Free Software Foundation, Inc. +/* Copyright (C) 2005, 2008 Free Software Foundation, Inc. Contributed by Richard Henderson <rth@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). 
@@ -36,40 +36,86 @@ typedef struct { - gomp_mutex_t mutex; - unsigned total; - unsigned arrived; - int generation; + /* Make sure total/generation is in a mostly read cacheline, while + awaited in a separate cacheline. */ + unsigned total __attribute__((aligned (64))); + unsigned generation; + unsigned awaited __attribute__((aligned (64))); } gomp_barrier_t; +typedef unsigned int gomp_barrier_state_t; static inline void gomp_barrier_init (gomp_barrier_t *bar, unsigned count) { - gomp_mutex_init (&bar->mutex); bar->total = count; - bar->arrived = 0; + bar->awaited = count; bar->generation = 0; } static inline void gomp_barrier_reinit (gomp_barrier_t *bar, unsigned count) { - gomp_mutex_lock (&bar->mutex); + __sync_fetch_and_add (&bar->awaited, count - bar->total); bar->total = count; - gomp_mutex_unlock (&bar->mutex); } static inline void gomp_barrier_destroy (gomp_barrier_t *bar) { - /* Before destroying, make sure all threads have left the barrier. */ - gomp_mutex_lock (&bar->mutex); } extern void gomp_barrier_wait (gomp_barrier_t *); -extern void gomp_barrier_wait_end (gomp_barrier_t *, bool); +extern void gomp_barrier_wait_last (gomp_barrier_t *); +extern void gomp_barrier_wait_end (gomp_barrier_t *, gomp_barrier_state_t); +extern void gomp_team_barrier_wait (gomp_barrier_t *); +extern void gomp_team_barrier_wait_end (gomp_barrier_t *, + gomp_barrier_state_t); +extern void gomp_team_barrier_wake (gomp_barrier_t *, int); -static inline bool gomp_barrier_wait_start (gomp_barrier_t *bar) +static inline gomp_barrier_state_t +gomp_barrier_wait_start (gomp_barrier_t *bar) { - gomp_mutex_lock (&bar->mutex); - return ++bar->arrived == bar->total; + unsigned int ret = bar->generation & ~3; + /* Do we need any barrier here or is __sync_add_and_fetch acting + as the needed LoadLoad barrier already? 
*/ + ret += __sync_add_and_fetch (&bar->awaited, -1) == 0; + return ret; +} + +static inline bool +gomp_barrier_last_thread (gomp_barrier_state_t state) +{ + return state & 1; +} + +/* All the inlines below must be called with team->task_lock + held. */ + +static inline void +gomp_team_barrier_set_task_pending (gomp_barrier_t *bar) +{ + bar->generation |= 1; +} + +static inline void +gomp_team_barrier_clear_task_pending (gomp_barrier_t *bar) +{ + bar->generation &= ~1; +} + +static inline void +gomp_team_barrier_set_waiting_for_tasks (gomp_barrier_t *bar) +{ + bar->generation |= 2; +} + +static inline bool +gomp_team_barrier_waiting_for_tasks (gomp_barrier_t *bar) +{ + return (bar->generation & 2) != 0; +} + +static inline void +gomp_team_barrier_done (gomp_barrier_t *bar, gomp_barrier_state_t state) +{ + bar->generation = (state & ~3) + 4; } #endif /* GOMP_BARRIER_H */ diff --git a/libgomp/config/linux/ia64/futex.h b/libgomp/config/linux/ia64/futex.h index 5e54982d6f7..35aa4f1fea0 100644 --- a/libgomp/config/linux/ia64/futex.h +++ b/libgomp/config/linux/ia64/futex.h @@ -1,4 +1,4 @@ -/* Copyright (C) 2005 Free Software Foundation, Inc. +/* Copyright (C) 2005, 2008 Free Software Foundation, Inc. Contributed by Richard Henderson <rth@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). 
@@ -29,23 +29,24 @@ #include <sys/syscall.h> -#define FUTEX_WAIT 0 -#define FUTEX_WAKE 1 -static inline void -sys_futex0(int *addr, int op, int val) +static inline long +sys_futex0(int *addr, long op, int val) { register long out0 asm ("out0") = (long) addr; register long out1 asm ("out1") = op; register long out2 asm ("out2") = val; register long out3 asm ("out3") = 0; + register long r8 asm ("r8"); + register long r10 asm ("r10"); register long r15 asm ("r15") = SYS_futex; __asm __volatile ("break 0x100000" - : "=r"(r15), "=r"(out0), "=r"(out1), "=r"(out2), "=r"(out3) + : "=r"(r15), "=r"(out0), "=r"(out1), "=r"(out2), "=r"(out3), + "=r"(r8), "=r"(r10) : "r"(r15), "r"(out0), "r"(out1), "r"(out2), "r"(out3) - : "memory", "r8", "r10", "out4", "out5", "out6", "out7", + : "memory", "out4", "out5", "out6", "out7", /* Non-stacked integer registers, minus r8, r10, r15. */ "r2", "r3", "r9", "r11", "r12", "r13", "r14", "r16", "r17", "r18", "r19", "r20", "r21", "r22", "r23", "r24", "r25", "r26", "r27", @@ -56,16 +57,41 @@ sys_futex0(int *addr, int op, int val) "f6", "f7", "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15", /* Branch registers. 
*/ "b6"); + return r8 & r10; } static inline void futex_wait (int *addr, int val) { - sys_futex0 (addr, FUTEX_WAIT, val); + long err = sys_futex0 (addr, gomp_futex_wait, val); + if (__builtin_expect (err == ENOSYS, 0)) + { + gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG; + gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG; + sys_futex0 (addr, gomp_futex_wait, val); + } } static inline void futex_wake (int *addr, int count) { - sys_futex0 (addr, FUTEX_WAKE, count); + long err = sys_futex0 (addr, gomp_futex_wake, count); + if (__builtin_expect (err == ENOSYS, 0)) + { + gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG; + gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG; + sys_futex0 (addr, gomp_futex_wake, count); + } +} + +static inline void +cpu_relax (void) +{ + __asm volatile ("hint @pause" : : : "memory"); +} + +static inline void +atomic_write_barrier (void) +{ + __sync_synchronize (); } diff --git a/libgomp/config/linux/lock.c b/libgomp/config/linux/lock.c index 211f6007b43..a2e07d320fd 100644 --- a/libgomp/config/linux/lock.c +++ b/libgomp/config/linux/lock.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2005 Free Software Foundation, Inc. +/* Copyright (C) 2005, 2008 Free Software Foundation, Inc. Contributed by Richard Henderson <rth@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). @@ -29,47 +29,109 @@ primitives. This implementation uses atomic instructions and the futex syscall. */ -#include "libgomp.h" #include <string.h> #include <unistd.h> #include <sys/syscall.h> -#include "futex.h" +#include "wait.h" /* The internal gomp_mutex_t and the external non-recursive omp_lock_t have the same form. Re-use it. 
*/ void -omp_init_lock (omp_lock_t *lock) +gomp_init_lock_30 (omp_lock_t *lock) { gomp_mutex_init (lock); } void -omp_destroy_lock (omp_lock_t *lock) +gomp_destroy_lock_30 (omp_lock_t *lock) { gomp_mutex_destroy (lock); } void -omp_set_lock (omp_lock_t *lock) +gomp_set_lock_30 (omp_lock_t *lock) { gomp_mutex_lock (lock); } void -omp_unset_lock (omp_lock_t *lock) +gomp_unset_lock_30 (omp_lock_t *lock) { gomp_mutex_unlock (lock); } int -omp_test_lock (omp_lock_t *lock) +gomp_test_lock_30 (omp_lock_t *lock) { return __sync_bool_compare_and_swap (lock, 0, 1); } -/* The external recursive omp_nest_lock_t form requires additional work. */ +void +gomp_init_nest_lock_30 (omp_nest_lock_t *lock) +{ + memset (lock, '\0', sizeof (*lock)); +} + +void +gomp_destroy_nest_lock_30 (omp_nest_lock_t *lock) +{ +} + +void +gomp_set_nest_lock_30 (omp_nest_lock_t *lock) +{ + void *me = gomp_icv (true); + + if (lock->owner != me) + { + gomp_mutex_lock (&lock->lock); + lock->owner = me; + } + + lock->count++; +} + +void +gomp_unset_nest_lock_30 (omp_nest_lock_t *lock) +{ + if (--lock->count == 0) + { + lock->owner = NULL; + gomp_mutex_unlock (&lock->lock); + } +} + +int +gomp_test_nest_lock_30 (omp_nest_lock_t *lock) +{ + void *me = gomp_icv (true); + + if (lock->owner == me) + return ++lock->count; + + if (__sync_bool_compare_and_swap (&lock->lock, 0, 1)) + { + lock->owner = me; + lock->count = 1; + return 1; + } + + return 0; +} + +#ifdef LIBGOMP_GNU_SYMBOL_VERSIONING +/* gomp_mutex_* can be safely locked in one thread and + unlocked in another thread, so the OpenMP 2.5 and OpenMP 3.0 + non-nested locks can be the same. */ +strong_alias (gomp_init_lock_30, gomp_init_lock_25) +strong_alias (gomp_destroy_lock_30, gomp_destroy_lock_25) +strong_alias (gomp_set_lock_30, gomp_set_lock_25) +strong_alias (gomp_unset_lock_30, gomp_unset_lock_25) +strong_alias (gomp_test_lock_30, gomp_test_lock_25) + +/* The external recursive omp_nest_lock_25_t form requires additional work. 
*/ /* We need an integer to uniquely identify this thread. Most generally this is the thread's TID, which ideally we'd get this straight from @@ -85,17 +147,17 @@ omp_test_lock (omp_lock_t *lock) always available directly. Make do with the gomp_thread pointer since it's handy. */ -#if !defined (HAVE_TLS) +# if !defined (HAVE_TLS) static inline int gomp_tid (void) { return syscall (SYS_gettid); } -#elif !defined(__LP64__) +# elif !defined(__LP64__) static inline int gomp_tid (void) { return (int) gomp_thread (); } -#else +# else static __thread int tid_cache; static inline int gomp_tid (void) { @@ -104,22 +166,22 @@ static inline int gomp_tid (void) tid_cache = tid = syscall (SYS_gettid); return tid; } -#endif +# endif void -omp_init_nest_lock (omp_nest_lock_t *lock) +gomp_init_nest_lock_25 (omp_nest_lock_25_t *lock) { memset (lock, 0, sizeof (lock)); } void -omp_destroy_nest_lock (omp_nest_lock_t *lock) +gomp_destroy_nest_lock_25 (omp_nest_lock_25_t *lock) { } void -omp_set_nest_lock (omp_nest_lock_t *lock) +gomp_set_nest_lock_25 (omp_nest_lock_25_t *lock) { int otid, tid = gomp_tid (); @@ -137,12 +199,12 @@ omp_set_nest_lock (omp_nest_lock_t *lock) return; } - futex_wait (&lock->owner, otid); + do_wait (&lock->owner, otid); } } void -omp_unset_nest_lock (omp_nest_lock_t *lock) +gomp_unset_nest_lock_25 (omp_nest_lock_25_t *lock) { /* ??? Validate that we own the lock here. 
*/ @@ -154,7 +216,7 @@ omp_unset_nest_lock (omp_nest_lock_t *lock) } int -omp_test_nest_lock (omp_nest_lock_t *lock) +gomp_test_nest_lock_25 (omp_nest_lock_25_t *lock) { int otid, tid = gomp_tid (); @@ -170,6 +232,19 @@ omp_test_nest_lock (omp_nest_lock_t *lock) return 0; } +omp_lock_symver (omp_init_lock) +omp_lock_symver (omp_destroy_lock) +omp_lock_symver (omp_set_lock) +omp_lock_symver (omp_unset_lock) +omp_lock_symver (omp_test_lock) +omp_lock_symver (omp_init_nest_lock) +omp_lock_symver (omp_destroy_nest_lock) +omp_lock_symver (omp_set_nest_lock) +omp_lock_symver (omp_unset_nest_lock) +omp_lock_symver (omp_test_nest_lock) + +#else + ialias (omp_init_lock) ialias (omp_init_nest_lock) ialias (omp_destroy_lock) @@ -180,3 +255,5 @@ ialias (omp_unset_lock) ialias (omp_unset_nest_lock) ialias (omp_test_lock) ialias (omp_test_nest_lock) + +#endif diff --git a/libgomp/config/linux/mutex.c b/libgomp/config/linux/mutex.c index fa3dfd1cb03..36c362eb274 100644 --- a/libgomp/config/linux/mutex.c +++ b/libgomp/config/linux/mutex.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2005 Free Software Foundation, Inc. +/* Copyright (C) 2005, 2008 Free Software Foundation, Inc. Contributed by Richard Henderson <rth@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). @@ -29,9 +29,10 @@ mechanism for libgomp. This type is private to the library. This implementation uses atomic instructions and the futex syscall. 
*/ -#include "libgomp.h" -#include "futex.h" +#include "wait.h" +long int gomp_futex_wake = FUTEX_WAKE | FUTEX_PRIVATE_FLAG; +long int gomp_futex_wait = FUTEX_WAIT | FUTEX_PRIVATE_FLAG; void gomp_mutex_lock_slow (gomp_mutex_t *mutex) @@ -40,7 +41,7 @@ gomp_mutex_lock_slow (gomp_mutex_t *mutex) { int oldval = __sync_val_compare_and_swap (mutex, 1, 2); if (oldval != 0) - futex_wait (mutex, 2); + do_wait (mutex, 2); } while (!__sync_bool_compare_and_swap (mutex, 0, 2)); } diff --git a/libgomp/config/linux/omp-lock.h b/libgomp/config/linux/omp-lock.h index 350cba16056..e65aff7fce7 100644 --- a/libgomp/config/linux/omp-lock.h +++ b/libgomp/config/linux/omp-lock.h @@ -3,8 +3,10 @@ structures without polluting the namespace. When using the Linux futex primitive, non-recursive locks require - only one int. Recursive locks require we identify the owning thread - and so require two ints. */ + only one int. Recursive locks require we identify the owning task + and so require one int and a pointer. */ typedef int omp_lock_t; -typedef struct { int owner, count; } omp_nest_lock_t; +typedef struct { int lock, count; void *owner; } omp_nest_lock_t; +typedef int omp_lock_25_t; +typedef struct { int owner, count; } omp_nest_lock_25_t; diff --git a/libgomp/config/linux/powerpc/futex.h b/libgomp/config/linux/powerpc/futex.h index 20e03573783..c1e0d0f5651 100644 --- a/libgomp/config/linux/powerpc/futex.h +++ b/libgomp/config/linux/powerpc/futex.h @@ -1,4 +1,4 @@ -/* Copyright (C) 2005 Free Software Foundation, Inc. +/* Copyright (C) 2005, 2008 Free Software Foundation, Inc. Contributed by Richard Henderson <rth@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). @@ -28,10 +28,8 @@ /* Provide target-specific access to the futex system call. 
*/ #include <sys/syscall.h> -#define FUTEX_WAIT 0 -#define FUTEX_WAKE 1 -static inline void +static inline long sys_futex0 (int *addr, int op, int val) { register long int r0 __asm__ ("r0"); @@ -50,21 +48,48 @@ sys_futex0 (int *addr, int op, int val) doesn't. It doesn't much matter for us. In the interest of unity, go ahead and clobber it always. */ - __asm volatile ("sc" + __asm volatile ("sc; mfcr %0" : "=r"(r0), "=r"(r3), "=r"(r4), "=r"(r5), "=r"(r6) : "r"(r0), "r"(r3), "r"(r4), "r"(r5), "r"(r6) : "r7", "r8", "r9", "r10", "r11", "r12", "cr0", "ctr", "memory"); + if (__builtin_expect (r0 & (1 << 28), 0)) + return r3; + return 0; } static inline void futex_wait (int *addr, int val) { - sys_futex0 (addr, FUTEX_WAIT, val); + long err = sys_futex0 (addr, gomp_futex_wait, val); + if (__builtin_expect (err == ENOSYS, 0)) + { + gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG; + gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG; + sys_futex0 (addr, gomp_futex_wait, val); + } } static inline void futex_wake (int *addr, int count) { - sys_futex0 (addr, FUTEX_WAKE, count); + long err = sys_futex0 (addr, gomp_futex_wake, count); + if (__builtin_expect (err == ENOSYS, 0)) + { + gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG; + gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG; + sys_futex0 (addr, gomp_futex_wake, count); + } +} + +static inline void +cpu_relax (void) +{ + __asm volatile ("" : : : "memory"); +} + +static inline void +atomic_write_barrier (void) +{ + __asm volatile ("eieio" : : : "memory"); } diff --git a/libgomp/config/linux/proc.c b/libgomp/config/linux/proc.c index 2267cfbd2d1..6a006f24aa3 100644 --- a/libgomp/config/linux/proc.c +++ b/libgomp/config/linux/proc.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2005, 2006, 2007 Free Software Foundation, Inc. +/* Copyright (C) 2005, 2006, 2007, 2008 Free Software Foundation, Inc. Contributed by Jakub Jelinek <jakub@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). 
@@ -78,14 +78,14 @@ gomp_init_num_threads (void) if (pthread_getaffinity_np (pthread_self (), sizeof (cpuset), &cpuset) == 0) { /* Count only the CPUs this process can use. */ - gomp_nthreads_var = cpuset_popcount (&cpuset); - if (gomp_nthreads_var == 0) - gomp_nthreads_var = 1; + gomp_global_icv.nthreads_var = cpuset_popcount (&cpuset); + if (gomp_global_icv.nthreads_var == 0) + gomp_global_icv.nthreads_var = 1; return; } #endif #ifdef _SC_NPROCESSORS_ONLN - gomp_nthreads_var = sysconf (_SC_NPROCESSORS_ONLN); + gomp_global_icv.nthreads_var = sysconf (_SC_NPROCESSORS_ONLN); #endif } @@ -132,7 +132,7 @@ get_num_procs (void) #ifdef _SC_NPROCESSORS_ONLN return sysconf (_SC_NPROCESSORS_ONLN); #else - return gomp_nthreads_var; + return gomp_icv (false)->nthreads_var; #endif } @@ -146,11 +146,11 @@ get_num_procs (void) unsigned gomp_dynamic_max_threads (void) { - unsigned n_onln, loadavg; + unsigned n_onln, loadavg, nthreads_var = gomp_icv (false)->nthreads_var; n_onln = get_num_procs (); - if (n_onln > gomp_nthreads_var) - n_onln = gomp_nthreads_var; + if (n_onln > nthreads_var) + n_onln = nthreads_var; loadavg = 0; #ifdef HAVE_GETLOADAVG diff --git a/libgomp/config/linux/ptrlock.c b/libgomp/config/linux/ptrlock.c new file mode 100644 index 00000000000..8faa1b2287d --- /dev/null +++ b/libgomp/config/linux/ptrlock.c @@ -0,0 +1,70 @@ +/* Copyright (C) 2008 Free Software Foundation, Inc. + Contributed by Jakub Jelinek <jakub@redhat.com>. + + This file is part of the GNU OpenMP Library (libgomp). + + Libgomp is free software; you can redistribute it and/or modify it + under the terms of the GNU Lesser General Public License as published by + the Free Software Foundation; either version 2.1 of the License, or + (at your option) any later version. + + Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. 
See the GNU Lesser General Public License for + more details. + + You should have received a copy of the GNU Lesser General Public License + along with libgomp; see the file COPYING.LIB. If not, write to the + Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, + MA 02110-1301, USA. */ + +/* As a special exception, if you link this library with other files, some + of which are compiled with GCC, to produce an executable, this library + does not by itself cause the resulting executable to be covered by the + GNU General Public License. This exception does not however invalidate + any other reasons why the executable file might be covered by the GNU + General Public License. */ + +/* This is a Linux specific implementation of a mutex synchronization + mechanism for libgomp. This type is private to the library. This + implementation uses atomic instructions and the futex syscall. */ + +#include <endian.h> +#include <limits.h> +#include "wait.h" + +void * +gomp_ptrlock_get_slow (gomp_ptrlock_t *ptrlock) +{ + int *intptr; + __sync_bool_compare_and_swap (ptrlock, 1, 2); + + /* futex works on ints, not pointers. + But a valid work share pointer will be at least + 8 byte aligned, so it is safe to assume the low + 32-bits of the pointer won't contain values 1 or 2. 
*/ + __asm volatile ("" : "=r" (intptr) : "0" (ptrlock)); +#if __BYTE_ORDER == __BIG_ENDIAN + if (sizeof (*ptrlock) > sizeof (int)) + intptr += (sizeof (*ptrlock) / sizeof (int)) - 1; +#endif + do + do_wait (intptr, 2); + while (*intptr == 2); + __asm volatile ("" : : : "memory"); + return *ptrlock; +} + +void +gomp_ptrlock_set_slow (gomp_ptrlock_t *ptrlock, void *ptr) +{ + int *intptr; + + *ptrlock = ptr; + __asm volatile ("" : "=r" (intptr) : "0" (ptrlock)); +#if __BYTE_ORDER == __BIG_ENDIAN + if (sizeof (*ptrlock) > sizeof (int)) + intptr += (sizeof (*ptrlock) / sizeof (int)) - 1; +#endif + futex_wake (intptr, INT_MAX); +} diff --git a/libgomp/config/linux/ptrlock.h b/libgomp/config/linux/ptrlock.h new file mode 100644 index 00000000000..bb5441676a4 --- /dev/null +++ b/libgomp/config/linux/ptrlock.h @@ -0,0 +1,65 @@ +/* Copyright (C) 2008 Free Software Foundation, Inc. + Contributed by Jakub Jelinek <jakub@redhat.com>. + + This file is part of the GNU OpenMP Library (libgomp). + + Libgomp is free software; you can redistribute it and/or modify it + under the terms of the GNU Lesser General Public License as published by + the Free Software Foundation; either version 2.1 of the License, or + (at your option) any later version. + + Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for + more details. + + You should have received a copy of the GNU Lesser General Public License + along with libgomp; see the file COPYING.LIB. If not, write to the + Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, + MA 02110-1301, USA. */ + +/* As a special exception, if you link this library with other files, some + of which are compiled with GCC, to produce an executable, this library + does not by itself cause the resulting executable to be covered by the + GNU General Public License. 
This exception does not however invalidate
+   any other reasons why the executable file might be covered by the GNU
+   General Public License. */
+
+/* This is a Linux specific implementation of a mutex synchronization
+   mechanism for libgomp. This type is private to the library. This
+   implementation uses atomic instructions and the futex syscall. */
+
+#ifndef GOMP_PTRLOCK_H
+#define GOMP_PTRLOCK_H 1
+
+typedef void *gomp_ptrlock_t;
+
+static inline void gomp_ptrlock_init (gomp_ptrlock_t *ptrlock, void *ptr)
+{
+  *ptrlock = ptr;
+}
+
+extern void *gomp_ptrlock_get_slow (gomp_ptrlock_t *ptrlock);
+static inline void *gomp_ptrlock_get (gomp_ptrlock_t *ptrlock)
+{
+  if ((uintptr_t) *ptrlock > 2)
+    return *ptrlock;
+
+  if (__sync_bool_compare_and_swap (ptrlock, NULL, (uintptr_t) 1))
+    return NULL;
+
+  return gomp_ptrlock_get_slow (ptrlock);
+}
+
+extern void gomp_ptrlock_set_slow (gomp_ptrlock_t *ptrlock, void *ptr);
+static inline void gomp_ptrlock_set (gomp_ptrlock_t *ptrlock, void *ptr)
+{
+  if (!__sync_bool_compare_and_swap (ptrlock, (uintptr_t) 1, ptr))
+    gomp_ptrlock_set_slow (ptrlock, ptr);
+}
+
+static inline void gomp_ptrlock_destroy (gomp_ptrlock_t *ptrlock)
+{
+}
+
+#endif /* GOMP_PTRLOCK_H */
diff --git a/libgomp/config/linux/s390/futex.h b/libgomp/config/linux/s390/futex.h
index 9b3820c0d97..3c4e1fc5d25 100644
--- a/libgomp/config/linux/s390/futex.h
+++ b/libgomp/config/linux/s390/futex.h
@@ -1,4 +1,4 @@
-/* Copyright (C) 2005 Free Software Foundation, Inc.
+/* Copyright (C) 2005, 2008 Free Software Foundation, Inc.
    Contributed by Jakub Jelinek <jakub@redhat.com>.

    This file is part of the GNU OpenMP Library (libgomp).
@@ -28,10 +28,8 @@
 /* Provide target-specific access to the futex system call.
*/

 #include <sys/syscall.h>
-#define FUTEX_WAIT 0
-#define FUTEX_WAKE 1

-static inline void
+static inline long
 sys_futex0 (int *addr, int op, int val)
 {
   register long int gpr2 __asm__ ("2");
@@ -49,16 +47,41 @@ sys_futex0 (int *addr, int op, int val)
                   : "i" (SYS_futex), "0" (gpr2), "d" (gpr3), "d" (gpr4), "d" (gpr5)
                   : "memory");
+  return gpr2;
 }

 static inline void
 futex_wait (int *addr, int val)
 {
-  sys_futex0 (addr, FUTEX_WAIT, val);
+  long err = sys_futex0 (addr, gomp_futex_wait, val);
+  if (__builtin_expect (err == -ENOSYS, 0))
+    {
+      gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG;
+      gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG;
+      sys_futex0 (addr, gomp_futex_wait, val);
+    }
 }

 static inline void
 futex_wake (int *addr, int count)
 {
-  sys_futex0 (addr, FUTEX_WAKE, count);
+  long err = sys_futex0 (addr, gomp_futex_wake, count);
+  if (__builtin_expect (err == -ENOSYS, 0))
+    {
+      gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG;
+      gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG;
+      sys_futex0 (addr, gomp_futex_wake, count);
+    }
+}
+
+static inline void
+cpu_relax (void)
+{
+  __asm volatile ("" : : : "memory");
+}
+
+static inline void
+atomic_write_barrier (void)
+{
+  __sync_synchronize ();
 }
diff --git a/libgomp/config/linux/sem.c b/libgomp/config/linux/sem.c
index 798e3f1f2c0..5615bc580ee 100644
--- a/libgomp/config/linux/sem.c
+++ b/libgomp/config/linux/sem.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2005 Free Software Foundation, Inc.
+/* Copyright (C) 2005, 2008 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.

    This file is part of the GNU OpenMP Library (libgomp).
@@ -29,8 +29,7 @@
    mechanism for libgomp. This type is private to the library. This
    implementation uses atomic instructions and the futex syscall.
*/

-#include "libgomp.h"
-#include "futex.h"
+#include "wait.h"


 void
@@ -44,7 +43,7 @@ gomp_sem_wait_slow (gomp_sem_t *sem)
           if (__sync_bool_compare_and_swap (sem, val, val - 1))
             return;
         }
-      futex_wait (sem, -1);
+      do_wait (sem, -1);
     }
 }
diff --git a/libgomp/config/linux/sparc/futex.h b/libgomp/config/linux/sparc/futex.h
index 7b1cc837956..b9bc387355f 100644
--- a/libgomp/config/linux/sparc/futex.h
+++ b/libgomp/config/linux/sparc/futex.h
@@ -1,4 +1,4 @@
-/* Copyright (C) 2005 Free Software Foundation, Inc.
+/* Copyright (C) 2005, 2008 Free Software Foundation, Inc.
    Contributed by Jakub Jelinek <jakub@redhat.com>.

    This file is part of the GNU OpenMP Library (libgomp).
@@ -28,10 +28,8 @@
 /* Provide target-specific access to the futex system call. */

 #include <sys/syscall.h>
-#define FUTEX_WAIT 0
-#define FUTEX_WAKE 1

-static inline void
+static inline long
 sys_futex0 (int *addr, int op, int val)
 {
   register long int g1 __asm__ ("g1");
@@ -47,9 +45,9 @@ sys_futex0 (int *addr, int op, int val)
   o3 = 0;

 #ifdef __arch64__
-# define SYSCALL_STRING "ta\t0x6d"
+# define SYSCALL_STRING "ta\t0x6d; bcs,a,pt %%xcc, 1f; sub %%g0, %%o0, %%o0; 1:"
 #else
-# define SYSCALL_STRING "ta\t0x10"
+# define SYSCALL_STRING "ta\t0x10; bcs,a 1f; sub %%g0, %%o0, %%o0; 1:"
 #endif

   __asm volatile (SYSCALL_STRING
@@ -65,16 +63,49 @@ sys_futex0 (int *addr, int op, int val)
                   "f48", "f50", "f52", "f54", "f56", "f58", "f60", "f62",
 #endif
                   "cc", "memory");
+  return o0;
 }

 static inline void
 futex_wait (int *addr, int val)
 {
-  sys_futex0 (addr, FUTEX_WAIT, val);
+  long err = sys_futex0 (addr, gomp_futex_wait, val);
+  if (__builtin_expect (err == ENOSYS, 0))
+    {
+      gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG;
+      gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG;
+      sys_futex0 (addr, gomp_futex_wait, val);
+    }
 }

 static inline void
 futex_wake (int *addr, int count)
 {
-  sys_futex0 (addr, FUTEX_WAKE, count);
+  long err = sys_futex0 (addr, gomp_futex_wake, count);
+  if (__builtin_expect (err == ENOSYS, 0))
+    {
+      gomp_futex_wait &=
~FUTEX_PRIVATE_FLAG;
+      gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG;
+      sys_futex0 (addr, gomp_futex_wake, count);
+    }
+}
+
+static inline void
+cpu_relax (void)
+{
+#if defined __arch64__ || defined __sparc_v9__
+  __asm volatile ("membar #LoadLoad" : : : "memory");
+#else
+  __asm volatile ("" : : : "memory");
+#endif
+}
+
+static inline void
+atomic_write_barrier (void)
+{
+#if defined __arch64__ || defined __sparc_v9__
+  __asm volatile ("membar #StoreStore" : : : "memory");
+#else
+  __sync_synchronize ();
+#endif
 }
diff --git a/libgomp/config/linux/wait.h b/libgomp/config/linux/wait.h
new file mode 100644
index 00000000000..21f0ab8756c
--- /dev/null
+++ b/libgomp/config/linux/wait.h
@@ -0,0 +1,68 @@
+/* Copyright (C) 2008 Free Software Foundation, Inc.
+   Contributed by Jakub Jelinek <jakub@redhat.com>.
+
+   This file is part of the GNU OpenMP Library (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU Lesser General Public License as published by
+   the Free Software Foundation; either version 2.1 of the License, or
+   (at your option) any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for
+   more details.
+
+   You should have received a copy of the GNU Lesser General Public License
+   along with libgomp; see the file COPYING.LIB. If not, write to the
+   Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+   MA 02110-1301, USA. */
+
+/* As a special exception, if you link this library with other files, some
+   of which are compiled with GCC, to produce an executable, this library
+   does not by itself cause the resulting executable to be covered by the
+   GNU General Public License.
This exception does not however invalidate
+   any other reasons why the executable file might be covered by the GNU
+   General Public License. */
+
+/* This is a Linux specific implementation of a mutex synchronization
+   mechanism for libgomp. This type is private to the library. This
+   implementation uses atomic instructions and the futex syscall. */
+
+#ifndef GOMP_WAIT_H
+#define GOMP_WAIT_H 1
+
+#include "libgomp.h"
+#include <errno.h>
+
+#define FUTEX_WAIT 0
+#define FUTEX_WAKE 1
+#define FUTEX_PRIVATE_FLAG 128L
+
+#ifdef HAVE_ATTRIBUTE_VISIBILITY
+# pragma GCC visibility push(hidden)
+#endif
+
+extern long int gomp_futex_wait, gomp_futex_wake;
+
+#include "futex.h"
+
+static inline void do_wait (int *addr, int val)
+{
+  unsigned long long i, count = gomp_spin_count_var;
+
+  if (__builtin_expect (gomp_managed_threads > gomp_available_cpus, 0))
+    count = gomp_throttled_spin_count_var;
+  for (i = 0; i < count; i++)
+    if (__builtin_expect (*addr != val, 0))
+      return;
+    else
+      cpu_relax ();
+  futex_wait (addr, val);
+}
+
+#ifdef HAVE_ATTRIBUTE_VISIBILITY
+# pragma GCC visibility pop
+#endif
+
+#endif /* GOMP_WAIT_H */
diff --git a/libgomp/config/linux/x86/futex.h b/libgomp/config/linux/x86/futex.h
index 4f9aac2ddbb..36af6aad93b 100644
--- a/libgomp/config/linux/x86/futex.h
+++ b/libgomp/config/linux/x86/futex.h
@@ -1,4 +1,4 @@
-/* Copyright (C) 2005 Free Software Foundation, Inc.
+/* Copyright (C) 2005, 2008 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.

    This file is part of the GNU OpenMP Library (libgomp).
@@ -27,9 +27,6 @@

 /* Provide target-specific access to the futex system call.
*/

-#define FUTEX_WAIT 0
-#define FUTEX_WAKE 1
-
 #ifdef __LP64__
 # ifndef SYS_futex
 # define SYS_futex 202
@@ -38,14 +35,26 @@
 static inline void
 futex_wait (int *addr, int val)
 {
-  register long r10 __asm__("%r10") = 0;
+  register long r10 __asm__("%r10");
   long res;

+  r10 = 0;
   __asm volatile ("syscall"
                   : "=a" (res)
-                  : "0"(SYS_futex), "D" (addr), "S"(FUTEX_WAIT),
-                    "d"(val), "r"(r10)
+                  : "0" (SYS_futex), "D" (addr), "S" (gomp_futex_wait),
+                    "d" (val), "r" (r10)
                   : "r11", "rcx", "memory");
+  if (__builtin_expect (res == -ENOSYS, 0))
+    {
+      gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG;
+      gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG;
+      r10 = 0;
+      __asm volatile ("syscall"
+                      : "=a" (res)
+                      : "0" (SYS_futex), "D" (addr), "S" (gomp_futex_wait),
+                        "d" (val), "r" (r10)
+                      : "r11", "rcx", "memory");
+    }
 }

 static inline void
@@ -55,8 +64,19 @@ futex_wake (int *addr, int count)

   __asm volatile ("syscall"
                   : "=a" (res)
-                  : "0"(SYS_futex), "D" (addr), "S"(FUTEX_WAKE), "d"(count)
+                  : "0" (SYS_futex), "D" (addr), "S" (gomp_futex_wake),
+                    "d" (count)
                   : "r11", "rcx", "memory");
+  if (__builtin_expect (res == -ENOSYS, 0))
+    {
+      gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG;
+      gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG;
+      __asm volatile ("syscall"
+                      : "=a" (res)
+                      : "0" (SYS_futex), "D" (addr), "S" (gomp_futex_wake),
+                        "d" (count)
+                      : "r11", "rcx", "memory");
+    }
 }
 #else
 # ifndef SYS_futex
@@ -65,7 +85,7 @@ futex_wake (int *addr, int count)

 # ifdef __PIC__

-static inline void
+static inline long
 sys_futex0 (int *addr, int op, int val)
 {
   long res;
@@ -77,11 +97,12 @@ sys_futex0 (int *addr, int op, int val)
                     : "0"(SYS_futex), "r" (addr), "c"(op),
                       "d"(val), "S"(0)
                     : "memory");
+  return res;
 }

 # else

-static inline void
+static inline long
 sys_futex0 (int *addr, int op, int val)
 {
   long res;
@@ -91,6 +112,7 @@ sys_futex0 (int *addr, int op, int val)
                     : "0"(SYS_futex), "b" (addr), "c"(op),
                       "d"(val), "S"(0)
                     : "memory");
+  return res;
 }

 # endif /* __PIC__ */
@@ -98,13 +120,37 @@ sys_futex0 (int *addr, int op, int val)
 static inline void
 futex_wait (int *addr, int val)
 {
-  sys_futex0 (addr, FUTEX_WAIT, val);
+  long res = sys_futex0 (addr, gomp_futex_wait, val);
+  if (__builtin_expect (res == -ENOSYS, 0))
+    {
+      gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG;
+      gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG;
+      sys_futex0 (addr, gomp_futex_wait, val);
+    }
 }

 static inline void
 futex_wake (int *addr, int count)
 {
-  sys_futex0 (addr, FUTEX_WAKE, count);
+  long res = sys_futex0 (addr, gomp_futex_wake, count);
+  if (__builtin_expect (res == -ENOSYS, 0))
+    {
+      gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG;
+      gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG;
+      sys_futex0 (addr, gomp_futex_wake, count);
+    }
 }

 #endif /* __LP64__ */
+
+static inline void
+cpu_relax (void)
+{
+  __asm volatile ("rep; nop" : : : "memory");
+}
+
+static inline void
+atomic_write_barrier (void)
+{
+  __sync_synchronize ();
+}
diff --git a/libgomp/config/mingw32/proc.c b/libgomp/config/mingw32/proc.c
index def7bb5e8f4..4532f45f572 100644
--- a/libgomp/config/mingw32/proc.c
+++ b/libgomp/config/mingw32/proc.c
@@ -35,7 +35,7 @@
 #include <windows.h>

 /* Count the CPU's currently available to this process. */
-static int
+static unsigned int
 count_avail_process_cpus ()
 {
   DWORD_PTR process_cpus;
@@ -59,7 +59,7 @@ count_avail_process_cpus ()
 void
 gomp_init_num_threads (void)
 {
-  gomp_nthreads_var = count_avail_process_cpus ();
+  gomp_global_icv.nthreads_var = count_avail_process_cpus ();
 }

 /* When OMP_DYNAMIC is set, at thread launch determine the number of
@@ -69,8 +69,9 @@ gomp_init_num_threads (void)
 unsigned
 gomp_dynamic_max_threads (void)
 {
-  int n_onln = count_avail_process_cpus ();
-  return n_onln > gomp_nthreads_var ? gomp_nthreads_var : n_onln;
+  unsigned int n_onln = count_avail_process_cpus ();
+  unsigned int nthreads_var = gomp_icv (false)->nthreads_var;
+  return n_onln > nthreads_var ?
nthreads_var : n_onln;
 }

 int
diff --git a/libgomp/config/posix/bar.c b/libgomp/config/posix/bar.c
index 79721610ca7..ff19e9353a3 100644
--- a/libgomp/config/posix/bar.c
+++ b/libgomp/config/posix/bar.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2005 Free Software Foundation, Inc.
+/* Copyright (C) 2005, 2008 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.

    This file is part of the GNU OpenMP Library (libgomp).
@@ -44,6 +44,7 @@ gomp_barrier_init (gomp_barrier_t *bar, unsigned count)
   gomp_sem_init (&bar->sem2, 0);
   bar->total = count;
   bar->arrived = 0;
+  bar->generation = 0;
 }

 void
@@ -70,11 +71,11 @@ gomp_barrier_reinit (gomp_barrier_t *bar, unsigned count)
 }

 void
-gomp_barrier_wait_end (gomp_barrier_t *bar, bool last)
+gomp_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
 {
   unsigned int n;

-  if (last)
+  if (state & 1)
     {
       n = --bar->arrived;
       if (n > 0)
@@ -109,3 +110,72 @@ gomp_barrier_wait (gomp_barrier_t *barrier)
 {
   gomp_barrier_wait_end (barrier, gomp_barrier_wait_start (barrier));
 }
+
+void
+gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
+{
+  unsigned int n;
+
+  if (state & 1)
+    {
+      n = --bar->arrived;
+      struct gomp_thread *thr = gomp_thread ();
+      struct gomp_team *team = thr->ts.team;
+
+      if (team->task_count)
+        {
+          gomp_barrier_handle_tasks (state);
+          if (n > 0)
+            gomp_sem_wait (&bar->sem2);
+          gomp_mutex_unlock (&bar->mutex1);
+          return;
+        }
+
+      bar->generation = state + 3;
+      if (n > 0)
+        {
+          do
+            gomp_sem_post (&bar->sem1);
+          while (--n != 0);
+          gomp_sem_wait (&bar->sem2);
+        }
+      gomp_mutex_unlock (&bar->mutex1);
+    }
+  else
+    {
+      gomp_mutex_unlock (&bar->mutex1);
+      do
+        {
+          gomp_sem_wait (&bar->sem1);
+          if (bar->generation & 1)
+            gomp_barrier_handle_tasks (state);
+        }
+      while (bar->generation != state + 4);
+
+#ifdef HAVE_SYNC_BUILTINS
+      n = __sync_add_and_fetch (&bar->arrived, -1);
+#else
+      gomp_mutex_lock (&bar->mutex2);
+      n = --bar->arrived;
+      gomp_mutex_unlock (&bar->mutex2);
+#endif
+
+
if (n == 0)
+        gomp_sem_post (&bar->sem2);
+    }
+}
+
+void
+gomp_team_barrier_wait (gomp_barrier_t *barrier)
+{
+  gomp_team_barrier_wait_end (barrier, gomp_barrier_wait_start (barrier));
+}
+
+void
+gomp_team_barrier_wake (gomp_barrier_t *bar, int count)
+{
+  if (count == 0)
+    count = bar->total - 1;
+  while (count-- > 0)
+    gomp_sem_post (&bar->sem1);
+}
diff --git a/libgomp/config/posix/bar.h b/libgomp/config/posix/bar.h
index 5275efa96a7..e4d2e680af2 100644
--- a/libgomp/config/posix/bar.h
+++ b/libgomp/config/posix/bar.h
@@ -1,4 +1,4 @@
-/* Copyright (C) 2005 Free Software Foundation, Inc.
+/* Copyright (C) 2005, 2008 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.

    This file is part of the GNU OpenMP Library (libgomp).
@@ -45,19 +45,74 @@ typedef struct
   gomp_sem_t sem2;
   unsigned total;
   unsigned arrived;
+  unsigned generation;
 } gomp_barrier_t;
+typedef unsigned int gomp_barrier_state_t;

 extern void gomp_barrier_init (gomp_barrier_t *, unsigned);
 extern void gomp_barrier_reinit (gomp_barrier_t *, unsigned);
 extern void gomp_barrier_destroy (gomp_barrier_t *);

 extern void gomp_barrier_wait (gomp_barrier_t *);
-extern void gomp_barrier_wait_end (gomp_barrier_t *, bool);
+extern void gomp_barrier_wait_end (gomp_barrier_t *, gomp_barrier_state_t);
+extern void gomp_team_barrier_wait (gomp_barrier_t *);
+extern void gomp_team_barrier_wait_end (gomp_barrier_t *,
+                                        gomp_barrier_state_t);
+extern void gomp_team_barrier_wake (gomp_barrier_t *, int);

-static inline bool gomp_barrier_wait_start (gomp_barrier_t *bar)
+static inline gomp_barrier_state_t
+gomp_barrier_wait_start (gomp_barrier_t *bar)
 {
+  unsigned int ret;
   gomp_mutex_lock (&bar->mutex1);
-  return ++bar->arrived == bar->total;
+  ret = bar->generation & ~3;
+  ret += ++bar->arrived == bar->total;
+  return ret;
+}
+
+static inline bool
+gomp_barrier_last_thread (gomp_barrier_state_t state)
+{
+  return state & 1;
+}
+
+static inline void
+gomp_barrier_wait_last (gomp_barrier_t *bar)
+{
+  gomp_barrier_wait (bar);
+}
+
+/* All the inlines below must be called with team->task_lock
+   held. */
+
+static inline void
+gomp_team_barrier_set_task_pending (gomp_barrier_t *bar)
+{
+  bar->generation |= 1;
+}
+
+static inline void
+gomp_team_barrier_clear_task_pending (gomp_barrier_t *bar)
+{
+  bar->generation &= ~1;
+}
+
+static inline void
+gomp_team_barrier_set_waiting_for_tasks (gomp_barrier_t *bar)
+{
+  bar->generation |= 2;
+}
+
+static inline bool
+gomp_team_barrier_waiting_for_tasks (gomp_barrier_t *bar)
+{
+  return (bar->generation & 2) != 0;
+}
+
+static inline void
+gomp_team_barrier_done (gomp_barrier_t *bar, gomp_barrier_state_t state)
+{
+  bar->generation = (state & ~3) + 4;
 }

 #endif /* GOMP_BARRIER_H */
diff --git a/libgomp/config/posix/lock.c b/libgomp/config/posix/lock.c
index 59459bb86ce..c2868398c66 100644
--- a/libgomp/config/posix/lock.c
+++ b/libgomp/config/posix/lock.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2005 Free Software Foundation, Inc.
+/* Copyright (C) 2005, 2008 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.

    This file is part of the GNU OpenMP Library (libgomp).
@@ -42,39 +42,209 @@
 #include "libgomp.h"

+#ifdef HAVE_BROKEN_POSIX_SEMAPHORES
+void
+gomp_init_lock_30 (omp_lock_t *lock)
+{
+  pthread_mutex_init (lock, NULL);
+}
+
+void
+gomp_destroy_lock_30 (omp_lock_t *lock)
+{
+  pthread_mutex_destroy (lock);
+}
+
+void
+gomp_set_lock_30 (omp_lock_t *lock)
+{
+  pthread_mutex_lock (lock);
+}
+
+void
+gomp_unset_lock_30 (omp_lock_t *lock)
+{
+  pthread_mutex_unlock (lock);
+}
+
+int
+gomp_test_lock_30 (omp_lock_t *lock)
+{
+  return pthread_mutex_trylock (lock) == 0;
+}
+
+void
+gomp_init_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  pthread_mutex_init (&lock->lock, NULL);
+  lock->count = 0;
+  lock->owner = NULL;
+}
+
+void
+gomp_destroy_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  pthread_mutex_destroy (&lock->lock);
+}
+
+void
+gomp_set_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  void *me = gomp_icv (true);
+
+  if (lock->owner != me)
+    {
+      pthread_mutex_lock (&lock->lock);
+      lock->owner = me;
+    }
+  lock->count++;
+}
+
+void
+gomp_unset_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  if (--lock->count == 0)
+    {
+      lock->owner = NULL;
+      pthread_mutex_unlock (&lock->lock);
+    }
+}
+
+int
+gomp_test_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  void *me = gomp_icv (true);
+
+  if (lock->owner != me)
+    {
+      if (pthread_mutex_trylock (&lock->lock) != 0)
+        return 0;
+      lock->owner = me;
+    }
+
+  return ++lock->count;
+}
+
+#else

 void
-omp_init_lock (omp_lock_t *lock)
+gomp_init_lock_30 (omp_lock_t *lock)
+{
+  sem_init (lock, 0, 1);
+}
+
+void
+gomp_destroy_lock_30 (omp_lock_t *lock)
+{
+  sem_destroy (lock);
+}
+
+void
+gomp_set_lock_30 (omp_lock_t *lock)
+{
+  while (sem_wait (lock) != 0)
+    ;
+}
+
+void
+gomp_unset_lock_30 (omp_lock_t *lock)
+{
+  sem_post (lock);
+}
+
+int
+gomp_test_lock_30 (omp_lock_t *lock)
+{
+  return sem_trywait (lock) == 0;
+}
+
+void
+gomp_init_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  sem_init (&lock->lock, 0, 1);
+  lock->count = 0;
+  lock->owner = NULL;
+}
+
+void
+gomp_destroy_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  sem_destroy
(&lock->lock);
+}
+
+void
+gomp_set_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  void *me = gomp_icv (true);
+
+  if (lock->owner != me)
+    {
+      while (sem_wait (&lock->lock) != 0)
+        ;
+      lock->owner = me;
+    }
+  lock->count++;
+}
+
+void
+gomp_unset_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  if (--lock->count == 0)
+    {
+      lock->owner = NULL;
+      sem_post (&lock->lock);
+    }
+}
+
+int
+gomp_test_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  void *me = gomp_icv (true);
+
+  if (lock->owner != me)
+    {
+      if (sem_trywait (&lock->lock) != 0)
+        return 0;
+      lock->owner = me;
+    }
+
+  return ++lock->count;
+}
+#endif
+
+#ifdef LIBGOMP_GNU_SYMBOL_VERSIONING
+void
+gomp_init_lock_25 (omp_lock_25_t *lock)
 {
   pthread_mutex_init (lock, NULL);
 }

 void
-omp_destroy_lock (omp_lock_t *lock)
+gomp_destroy_lock_25 (omp_lock_25_t *lock)
 {
   pthread_mutex_destroy (lock);
 }

 void
-omp_set_lock (omp_lock_t *lock)
+gomp_set_lock_25 (omp_lock_25_t *lock)
 {
   pthread_mutex_lock (lock);
 }

 void
-omp_unset_lock (omp_lock_t *lock)
+gomp_unset_lock_25 (omp_lock_25_t *lock)
 {
   pthread_mutex_unlock (lock);
 }

 int
-omp_test_lock (omp_lock_t *lock)
+gomp_test_lock_25 (omp_lock_25_t *lock)
 {
   return pthread_mutex_trylock (lock) == 0;
 }

 void
-omp_init_nest_lock (omp_nest_lock_t *lock)
+gomp_init_nest_lock_25 (omp_nest_lock_25_t *lock)
 {
   pthread_mutexattr_t attr;

@@ -86,33 +256,46 @@ omp_init_nest_lock (omp_nest_lock_t *lock)
 }

 void
-omp_destroy_nest_lock (omp_nest_lock_t *lock)
+gomp_destroy_nest_lock_25 (omp_nest_lock_25_t *lock)
 {
   pthread_mutex_destroy (&lock->lock);
 }

 void
-omp_set_nest_lock (omp_nest_lock_t *lock)
+gomp_set_nest_lock_25 (omp_nest_lock_25_t *lock)
 {
   pthread_mutex_lock (&lock->lock);
   lock->count++;
 }

 void
-omp_unset_nest_lock (omp_nest_lock_t *lock)
+gomp_unset_nest_lock_25 (omp_nest_lock_25_t *lock)
 {
   lock->count--;
   pthread_mutex_unlock (&lock->lock);
 }

 int
-omp_test_nest_lock (omp_nest_lock_t *lock)
+gomp_test_nest_lock_25 (omp_nest_lock_25_t *lock)
 {
   if (pthread_mutex_trylock (&lock->lock) == 0)
     return
++lock->count;
   return 0;
 }

+omp_lock_symver (omp_init_lock)
+omp_lock_symver (omp_destroy_lock)
+omp_lock_symver (omp_set_lock)
+omp_lock_symver (omp_unset_lock)
+omp_lock_symver (omp_test_lock)
+omp_lock_symver (omp_init_nest_lock)
+omp_lock_symver (omp_destroy_nest_lock)
+omp_lock_symver (omp_set_nest_lock)
+omp_lock_symver (omp_unset_nest_lock)
+omp_lock_symver (omp_test_nest_lock)
+
+#else
+
 ialias (omp_init_lock)
 ialias (omp_init_nest_lock)
 ialias (omp_destroy_lock)
@@ -123,3 +306,5 @@ ialias (omp_unset_lock)
 ialias (omp_unset_nest_lock)
 ialias (omp_test_lock)
 ialias (omp_test_nest_lock)
+
+#endif
diff --git a/libgomp/config/posix/omp-lock.h b/libgomp/config/posix/omp-lock.h
index ed70618d87d..e51dc271f8a 100644
--- a/libgomp/config/posix/omp-lock.h
+++ b/libgomp/config/posix/omp-lock.h
@@ -2,10 +2,22 @@
    alignment of the public OpenMP locks, so that we can export data
    structures without polluting the namespace.

-   In this default POSIX implementation, we map the two locks to the
-   same PTHREADS primitive. */
+   In this default POSIX implementation, we used to map the two locks to the
+   same PTHREADS primitive, but for OpenMP 3.0 sem_t needs to be used
+   instead, as pthread_mutex_unlock should not be called by different
+   thread than the one that called pthread_mutex_lock. */

 #include <pthread.h>
+#include <semaphore.h>

+typedef pthread_mutex_t omp_lock_25_t;
+typedef struct { pthread_mutex_t lock; int count; } omp_nest_lock_25_t;
+#ifdef HAVE_BROKEN_POSIX_SEMAPHORES
+/* If we don't have working semaphores, we'll make all explicit tasks
+   tied to the creating thread.
*/
 typedef pthread_mutex_t omp_lock_t;
-typedef struct { pthread_mutex_t lock; int count; } omp_nest_lock_t;
+typedef struct { pthread_mutex_t lock; int count; void *owner; } omp_nest_lock_t;
+#else
+typedef sem_t omp_lock_t;
+typedef struct { sem_t lock; int count; void *owner; } omp_nest_lock_t;
+#endif
diff --git a/libgomp/config/posix/proc.c b/libgomp/config/posix/proc.c
index 3ee84f5c9d6..0c1096fb6b1 100644
--- a/libgomp/config/posix/proc.c
+++ b/libgomp/config/posix/proc.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2005, 2006 Free Software Foundation, Inc.
+/* Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.

    This file is part of the GNU OpenMP Library (libgomp).
@@ -48,7 +48,7 @@ void
 gomp_init_num_threads (void)
 {
 #ifdef _SC_NPROCESSORS_ONLN
-  gomp_nthreads_var = sysconf (_SC_NPROCESSORS_ONLN);
+  gomp_global_icv.nthreads_var = sysconf (_SC_NPROCESSORS_ONLN);
 #endif
 }

@@ -63,13 +63,14 @@ unsigned
 gomp_dynamic_max_threads (void)
 {
   unsigned n_onln, loadavg;
+  unsigned nthreads_var = gomp_icv (false)->nthreads_var;

 #ifdef _SC_NPROCESSORS_ONLN
   n_onln = sysconf (_SC_NPROCESSORS_ONLN);
-  if (n_onln > gomp_nthreads_var)
-    n_onln = gomp_nthreads_var;
+  if (n_onln > nthreads_var)
+    n_onln = nthreads_var;
 #else
-  n_onln = gomp_nthreads_var;
+  n_onln = nthreads_var;
 #endif

   loadavg = 0;
@@ -96,7 +97,7 @@ omp_get_num_procs (void)
 #ifdef _SC_NPROCESSORS_ONLN
   return sysconf (_SC_NPROCESSORS_ONLN);
 #else
-  return gomp_nthreads_var;
+  return gomp_icv (false)->nthreads_var;
 #endif
 }

diff --git a/libgomp/config/posix/ptrlock.c b/libgomp/config/posix/ptrlock.c
new file mode 100644
index 00000000000..39bb64da0f9
--- /dev/null
+++ b/libgomp/config/posix/ptrlock.c
@@ -0,0 +1 @@
+/* Everything is in the header.
*/
diff --git a/libgomp/config/posix/ptrlock.h b/libgomp/config/posix/ptrlock.h
new file mode 100644
index 00000000000..1271ebb227b
--- /dev/null
+++ b/libgomp/config/posix/ptrlock.h
@@ -0,0 +1,69 @@
+/* Copyright (C) 2008 Free Software Foundation, Inc.
+   Contributed by Jakub Jelinek <jakub@redhat.com>.
+
+   This file is part of the GNU OpenMP Library (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU Lesser General Public License as published by
+   the Free Software Foundation; either version 2.1 of the License, or
+   (at your option) any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for
+   more details.
+
+   You should have received a copy of the GNU Lesser General Public License
+   along with libgomp; see the file COPYING.LIB. If not, write to the
+   Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+   MA 02110-1301, USA. */
+
+/* As a special exception, if you link this library with other files, some
+   of which are compiled with GCC, to produce an executable, this library
+   does not by itself cause the resulting executable to be covered by the
+   GNU General Public License. This exception does not however invalidate
+   any other reasons why the executable file might be covered by the GNU
+   General Public License. */
+
+/* This is a Linux specific implementation of a mutex synchronization
+   mechanism for libgomp. This type is private to the library. This
+   implementation uses atomic instructions and the futex syscall.
*/
+
+#ifndef GOMP_PTRLOCK_H
+#define GOMP_PTRLOCK_H 1
+
+typedef struct { void *ptr; gomp_mutex_t lock; } gomp_ptrlock_t;
+
+static inline void gomp_ptrlock_init (gomp_ptrlock_t *ptrlock, void *ptr)
+{
+  ptrlock->ptr = ptr;
+  gomp_mutex_init (&ptrlock->lock);
+}
+
+static inline void *gomp_ptrlock_get (gomp_ptrlock_t *ptrlock)
+{
+  if (ptrlock->ptr != NULL)
+    return ptrlock->ptr;
+
+  gomp_mutex_lock (&ptrlock->lock);
+  if (ptrlock->ptr != NULL)
+    {
+      gomp_mutex_unlock (&ptrlock->lock);
+      return ptrlock->ptr;
+    }
+
+  return NULL;
+}
+
+static inline void gomp_ptrlock_set (gomp_ptrlock_t *ptrlock, void *ptr)
+{
+  ptrlock->ptr = ptr;
+  gomp_mutex_unlock (&ptrlock->lock);
+}
+
+static inline void gomp_ptrlock_destroy (gomp_ptrlock_t *ptrlock)
+{
+  gomp_mutex_destroy (&ptrlock->lock);
+}
+
+#endif /* GOMP_PTRLOCK_H */
diff --git a/libgomp/config/posix95/lock.c b/libgomp/config/posix95/lock.c
index 2416f1131c7..e27437ead16 100644
--- a/libgomp/config/posix95/lock.c
+++ b/libgomp/config/posix95/lock.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2006 Free Software Foundation, Inc.
+/* Copyright (C) 2006, 2008 Free Software Foundation, Inc.

    This file is part of the GNU OpenMP Library (libgomp).
@@ -33,39 +33,212 @@
 #include "libgomp.h"

+#ifdef HAVE_BROKEN_POSIX_SEMAPHORES
+void
+gomp_init_lock_30 (omp_lock_t *lock)
+{
+  pthread_mutex_init (lock, NULL);
+}
+
+void
+gomp_destroy_lock_30 (omp_lock_t *lock)
+{
+  pthread_mutex_destroy (lock);
+}
+
+void
+gomp_set_lock_30 (omp_lock_t *lock)
+{
+  pthread_mutex_lock (lock);
+}
+
+void
+gomp_unset_lock_30 (omp_lock_t *lock)
+{
+  pthread_mutex_unlock (lock);
+}
+
+int
+gomp_test_lock_30 (omp_lock_t *lock)
+{
+  return pthread_mutex_trylock (lock) == 0;
+}
+
+void
+gomp_init_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  pthread_mutex_init (&lock->lock, NULL);
+  lock->owner = NULL;
+  lock->count = 0;
+}
+
+void
+gomp_destroy_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  pthread_mutex_destroy (&lock->lock);
+}

 void
-omp_init_lock (omp_lock_t *lock)
+gomp_set_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  void *me = gomp_icv (true);
+
+  if (lock->owner != me)
+    {
+      pthread_mutex_lock (&lock->lock);
+      lock->owner = me;
+    }
+
+  lock->count++;
+}
+
+void
+gomp_unset_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  lock->count--;
+
+  if (lock->count == 0)
+    {
+      lock->owner = NULL;
+      pthread_mutex_unlock (&lock->lock);
+    }
+}
+
+int
+gomp_test_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  void *me = gomp_icv (true);
+
+  if (lock->owner != me)
+    {
+      if (pthread_mutex_trylock (&lock->lock) != 0)
+        return 0;
+      lock->owner = me;
+    }
+
+  return ++lock->count;
+}
+
+#else
+
+void
+gomp_init_lock_30 (omp_lock_t *lock)
+{
+  sem_init (lock, 0, 1);
+}
+
+void
+gomp_destroy_lock_30 (omp_lock_t *lock)
+{
+  sem_destroy (lock);
+}
+
+void
+gomp_set_lock_30 (omp_lock_t *lock)
+{
+  while (sem_wait (lock) != 0)
+    ;
+}
+
+void
+gomp_unset_lock_30 (omp_lock_t *lock)
+{
+  sem_post (lock);
+}
+
+int
+gomp_test_lock_30 (omp_lock_t *lock)
+{
+  return sem_trywait (lock) == 0;
+}
+
+void
+gomp_init_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  sem_init (&lock->lock, 0, 1);
+  lock->count = 0;
+  lock->owner = NULL;
+}
+
+void
+gomp_destroy_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  sem_destroy (&lock->lock);
+}
+
+void
+gomp_set_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  void *me = gomp_icv (true);
+
+  if (lock->owner != me)
+    {
+      while (sem_wait (&lock->lock) != 0)
+        ;
+      lock->owner = me;
+    }
+  lock->count++;
+}
+
+void
+gomp_unset_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  if (--lock->count == 0)
+    {
+      lock->owner = NULL;
+      sem_post (&lock->lock);
+    }
+}
+
+int
+gomp_test_nest_lock_30 (omp_nest_lock_t *lock)
+{
+  void *me = gomp_icv (true);
+
+  if (lock->owner != me)
+    {
+      if (sem_trywait (&lock->lock) != 0)
+        return 0;
+      lock->owner = me;
+    }
+
+  return ++lock->count;
+}
+#endif
+
+#ifdef LIBGOMP_GNU_SYMBOL_VERSIONING
+void
+gomp_init_lock_25 (omp_lock_25_t *lock)
 {
   pthread_mutex_init (lock, NULL);
 }

 void
-omp_destroy_lock (omp_lock_t *lock)
+gomp_destroy_lock_25 (omp_lock_25_t *lock)
 {
   pthread_mutex_destroy (lock);
 }

 void
-omp_set_lock (omp_lock_t *lock)
+gomp_set_lock_25 (omp_lock_25_t *lock)
 {
   pthread_mutex_lock (lock);
 }

 void
-omp_unset_lock (omp_lock_t *lock)
+gomp_unset_lock_25 (omp_lock_25_t *lock)
 {
   pthread_mutex_unlock (lock);
 }

 int
-omp_test_lock (omp_lock_t *lock)
+gomp_test_lock_25 (omp_lock_25_t *lock)
 {
   return pthread_mutex_trylock (lock) == 0;
 }

 void
-omp_init_nest_lock (omp_nest_lock_t *lock)
+gomp_init_nest_lock_25 (omp_nest_lock_25_t *lock)
 {
   pthread_mutex_init (&lock->lock, NULL);
   lock->owner = (pthread_t) 0;
@@ -73,13 +246,13 @@ omp_init_nest_lock (omp_nest_lock_t *lock)
 }

 void
-omp_destroy_nest_lock (omp_nest_lock_t *lock)
+gomp_destroy_nest_lock_25 (omp_nest_lock_25_t *lock)
 {
   pthread_mutex_destroy (&lock->lock);
 }

 void
-omp_set_nest_lock (omp_nest_lock_t *lock)
+gomp_set_nest_lock_25 (omp_nest_lock_25_t *lock)
 {
   pthread_t me = pthread_self ();

@@ -93,7 +266,7 @@ omp_set_nest_lock (omp_nest_lock_t *lock)
 }

 void
-omp_unset_nest_lock (omp_nest_lock_t *lock)
+gomp_unset_nest_lock_25 (omp_nest_lock_25_t *lock)
 {
   lock->count--;

@@ -105,7 +278,7 @@ omp_unset_nest_lock (omp_nest_lock_t *lock)
 }

 int
-omp_test_nest_lock
(omp_nest_lock_t *lock) +gomp_test_nest_lock_25 (omp_nest_lock_25_t *lock) { pthread_t me = pthread_self (); @@ -119,6 +292,19 @@ omp_test_nest_lock (omp_nest_lock_t *lock) return ++lock->count; } +omp_lock_symver (omp_init_lock) +omp_lock_symver (omp_destroy_lock) +omp_lock_symver (omp_set_lock) +omp_lock_symver (omp_unset_lock) +omp_lock_symver (omp_test_lock) +omp_lock_symver (omp_init_nest_lock) +omp_lock_symver (omp_destroy_nest_lock) +omp_lock_symver (omp_set_nest_lock) +omp_lock_symver (omp_unset_nest_lock) +omp_lock_symver (omp_test_nest_lock) + +#else + ialias (omp_init_lock) ialias (omp_init_nest_lock) ialias (omp_destroy_lock) @@ -129,3 +315,5 @@ ialias (omp_unset_lock) ialias (omp_unset_nest_lock) ialias (omp_test_lock) ialias (omp_test_nest_lock) + +#endif diff --git a/libgomp/config/posix95/omp-lock.h b/libgomp/config/posix95/omp-lock.h index c446aec01e5..b542ba13192 100644 --- a/libgomp/config/posix95/omp-lock.h +++ b/libgomp/config/posix95/omp-lock.h @@ -6,12 +6,16 @@ same PTHREADS primitive. */ #include <pthread.h> +#include <semaphore.h> +typedef pthread_mutex_t omp_lock_25_t; +typedef struct { pthread_mutex_t lock; pthread_t owner; int count; } omp_nest_lock_25_t; +#ifdef HAVE_BROKEN_POSIX_SEMAPHORES +/* If we don't have working semaphores, we'll make all explicit tasks + tied to the creating thread. 
*/ typedef pthread_mutex_t omp_lock_t; - -typedef struct -{ - pthread_mutex_t lock; - pthread_t owner; - int count; -} omp_nest_lock_t; +typedef struct { pthread_mutex_t lock; int count; void *owner; } omp_nest_lock_t; +#else +typedef sem_t omp_lock_t; +typedef struct { sem_t lock; int count; void *owner; } omp_nest_lock_t; +#endif diff --git a/libgomp/configure b/libgomp/configure index 0fda5c740a0..f22c8a06c31 100755 --- a/libgomp/configure +++ b/libgomp/configure @@ -457,7 +457,7 @@ ac_includes_default="\ # include <unistd.h> #endif" -ac_subst_vars='SHELL PATH_SEPARATOR PACKAGE_NAME PACKAGE_TARNAME PACKAGE_VERSION PACKAGE_STRING PACKAGE_BUGREPORT exec_prefix prefix program_transform_name bindir sbindir libexecdir datadir sysconfdir sharedstatedir localstatedir libdir includedir oldincludedir infodir mandir build_alias host_alias target_alias DEFS ECHO_C ECHO_N ECHO_T LIBS GENINSRC_TRUE GENINSRC_FALSE build build_cpu build_vendor build_os host host_cpu host_vendor host_os target target_cpu target_vendor target_os INSTALL_PROGRAM INSTALL_SCRIPT INSTALL_DATA CYGPATH_W PACKAGE VERSION ACLOCAL AUTOCONF AUTOMAKE AUTOHEADER MAKEINFO install_sh STRIP ac_ct_STRIP INSTALL_STRIP_PROGRAM mkdir_p AWK SET_MAKE am__leading_dot AMTAR am__tar am__untar multi_basedir toolexecdir toolexeclibdir CC ac_ct_CC EXEEXT OBJEXT DEPDIR am__include am__quote AMDEP_TRUE AMDEP_FALSE AMDEPBACKSLASH CCDEPMODE am__fastdepCC_TRUE am__fastdepCC_FALSE CFLAGS AR ac_ct_AR RANLIB ac_ct_RANLIB PERL BUILD_INFO_TRUE BUILD_INFO_FALSE LIBTOOL SED EGREP FGREP GREP LD DUMPBIN ac_ct_DUMPBIN NM LN_S lt_ECHO CPP CPPFLAGS enable_shared enable_static MAINTAINER_MODE_TRUE MAINTAINER_MODE_FALSE MAINT FC FCFLAGS LDFLAGS ac_ct_FC libtool_VERSION SECTION_LDFLAGS OPT_LDFLAGS LIBGOMP_BUILD_VERSIONED_SHLIB_TRUE LIBGOMP_BUILD_VERSIONED_SHLIB_FALSE config_path XCFLAGS XLDFLAGS link_gomp USE_FORTRAN_TRUE USE_FORTRAN_FALSE OMP_LOCK_SIZE OMP_LOCK_ALIGN OMP_NEST_LOCK_SIZE OMP_NEST_LOCK_ALIGN OMP_LOCK_KIND OMP_NEST_LOCK_KIND 
LIBOBJS LTLIBOBJS' +ac_subst_vars='SHELL PATH_SEPARATOR PACKAGE_NAME PACKAGE_TARNAME PACKAGE_VERSION PACKAGE_STRING PACKAGE_BUGREPORT exec_prefix prefix program_transform_name bindir sbindir libexecdir datadir sysconfdir sharedstatedir localstatedir libdir includedir oldincludedir infodir mandir build_alias host_alias target_alias DEFS ECHO_C ECHO_N ECHO_T LIBS GENINSRC_TRUE GENINSRC_FALSE build build_cpu build_vendor build_os host host_cpu host_vendor host_os target target_cpu target_vendor target_os INSTALL_PROGRAM INSTALL_SCRIPT INSTALL_DATA CYGPATH_W PACKAGE VERSION ACLOCAL AUTOCONF AUTOMAKE AUTOHEADER MAKEINFO install_sh STRIP ac_ct_STRIP INSTALL_STRIP_PROGRAM mkdir_p AWK SET_MAKE am__leading_dot AMTAR am__tar am__untar multi_basedir toolexecdir toolexeclibdir CC ac_ct_CC EXEEXT OBJEXT DEPDIR am__include am__quote AMDEP_TRUE AMDEP_FALSE AMDEPBACKSLASH CCDEPMODE am__fastdepCC_TRUE am__fastdepCC_FALSE CFLAGS AR ac_ct_AR RANLIB ac_ct_RANLIB PERL BUILD_INFO_TRUE BUILD_INFO_FALSE LIBTOOL SED EGREP FGREP GREP LD DUMPBIN ac_ct_DUMPBIN NM LN_S lt_ECHO CPP CPPFLAGS enable_shared enable_static MAINTAINER_MODE_TRUE MAINTAINER_MODE_FALSE MAINT FC FCFLAGS LDFLAGS ac_ct_FC libtool_VERSION SECTION_LDFLAGS OPT_LDFLAGS LIBGOMP_BUILD_VERSIONED_SHLIB_TRUE LIBGOMP_BUILD_VERSIONED_SHLIB_FALSE config_path XCFLAGS XLDFLAGS link_gomp USE_FORTRAN_TRUE USE_FORTRAN_FALSE OMP_LOCK_SIZE OMP_LOCK_ALIGN OMP_NEST_LOCK_SIZE OMP_NEST_LOCK_ALIGN OMP_LOCK_KIND OMP_NEST_LOCK_KIND OMP_LOCK_25_SIZE OMP_LOCK_25_ALIGN OMP_NEST_LOCK_25_SIZE OMP_NEST_LOCK_25_ALIGN OMP_LOCK_25_KIND OMP_NEST_LOCK_25_KIND LIBOBJS LTLIBOBJS' ac_subst_files='' # Initialize some variables set by options. @@ -17988,6 +17988,14 @@ fi echo "$as_me: versioning on shared library symbols is $enable_symvers" >&6;} +if test $enable_symvers = gnu; then + +cat >>confdefs.h <<\_ACEOF +#define LIBGOMP_GNU_SYMBOL_VERSIONING 1 +_ACEOF + +fi + # Get target configury. . 
${srcdir}/configure.tgt CFLAGS="$save_CFLAGS $XCFLAGS" @@ -18156,7 +18164,7 @@ fi save_CFLAGS="$CFLAGS" for i in $config_path; do if test -f $srcdir/config/$i/omp-lock.h; then - CFLAGS="$CFLAGS -include $srcdir/config/$i/omp-lock.h" + CFLAGS="$CFLAGS -include confdefs.h -include $srcdir/config/$i/omp-lock.h" break fi done @@ -19471,6 +19479,1316 @@ rm -f core *.core gmon.out bb.out conftest$ac_exeext conftest.$ac_objext conftes fi fi rm -f conftest.val +if test "$cross_compiling" = yes; then + # Depending upon the size, compute the lo and hi bounds. +cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((sizeof (omp_lock_25_t)) >= 0)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_lo=0 ac_mid=0 + while :; do + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. 
*/ + +int +main () +{ +static int test_array [1 - 2 * !((sizeof (omp_lock_25_t)) <= $ac_mid)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_hi=$ac_mid; break +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_lo=`expr $ac_mid + 1` + if test $ac_lo -le $ac_mid; then + ac_lo= ac_hi= + break + fi + ac_mid=`expr 2 '*' $ac_mid + 1` +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext + done +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((sizeof (omp_lock_25_t)) < 0)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? 
+ echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_hi=-1 ac_mid=-1 + while :; do + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((sizeof (omp_lock_25_t)) >= $ac_mid)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_lo=$ac_mid; break +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_hi=`expr '(' $ac_mid ')' - 1` + if test $ac_mid -le $ac_hi; then + ac_lo= ac_hi= + break + fi + ac_mid=`expr 2 '*' $ac_mid` +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext + done +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_lo= ac_hi= +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext +# Binary search between lo and hi bounds. 
+while test "x$ac_lo" != "x$ac_hi"; do + ac_mid=`expr '(' $ac_hi - $ac_lo ')' / 2 + $ac_lo` + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((sizeof (omp_lock_25_t)) <= $ac_mid)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_hi=$ac_mid +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_lo=`expr '(' $ac_mid ')' + 1` +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext +done +case $ac_lo in +?*) OMP_LOCK_25_SIZE=$ac_lo;; +'') { { echo "$as_me:$LINENO: error: unsupported system, cannot find sizeof (omp_lock_25_t)" >&5 +echo "$as_me: error: unsupported system, cannot find sizeof (omp_lock_25_t)" >&2;} + { (exit 1); exit 1; }; } ;; +esac +else + if test "$cross_compiling" = yes; then + { { echo "$as_me:$LINENO: error: cannot run test program while cross compiling +See \`config.log' for more details." >&5 +echo "$as_me: error: cannot run test program while cross compiling +See \`config.log' for more details." >&2;} + { (exit 1); exit 1; }; } +else + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. 
*/ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +long longval () { return sizeof (omp_lock_25_t); } +unsigned long ulongval () { return sizeof (omp_lock_25_t); } +#include <stdio.h> +#include <stdlib.h> +int +main () +{ + + FILE *f = fopen ("conftest.val", "w"); + if (! f) + exit (1); + if ((sizeof (omp_lock_25_t)) < 0) + { + long i = longval (); + if (i != (sizeof (omp_lock_25_t))) + exit (1); + fprintf (f, "%ld\n", i); + } + else + { + unsigned long i = ulongval (); + if (i != (sizeof (omp_lock_25_t))) + exit (1); + fprintf (f, "%lu\n", i); + } + exit (ferror (f) || fclose (f) != 0); + + ; + return 0; +} +_ACEOF +rm -f conftest$ac_exeext +if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 + (eval $ac_link) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && { ac_try='./conftest$ac_exeext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + OMP_LOCK_25_SIZE=`cat conftest.val` +else + echo "$as_me: program exited with status $ac_status" >&5 +echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +( exit $ac_status ) +{ { echo "$as_me:$LINENO: error: unsupported system, cannot find sizeof (omp_lock_25_t)" >&5 +echo "$as_me: error: unsupported system, cannot find sizeof (omp_lock_25_t)" >&2;} + { (exit 1); exit 1; }; } +fi +rm -f core *.core gmon.out bb.out conftest$ac_exeext conftest.$ac_objext conftest.$ac_ext +fi +fi +rm -f conftest.val +if test "$cross_compiling" = yes; then + # Depending upon the size, compute the lo and hi bounds. +cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. 
*/ + +int +main () +{ +static int test_array [1 - 2 * !((__alignof (omp_lock_25_t)) >= 0)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_lo=0 ac_mid=0 + while :; do + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((__alignof (omp_lock_25_t)) <= $ac_mid)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 + (exit $ac_status); }; }; then + ac_hi=$ac_mid; break +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_lo=`expr $ac_mid + 1` + if test $ac_lo -le $ac_mid; then + ac_lo= ac_hi= + break + fi + ac_mid=`expr 2 '*' $ac_mid + 1` +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext + done +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((__alignof (omp_lock_25_t)) < 0)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_hi=-1 ac_mid=-1 + while :; do + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((__alignof (omp_lock_25_t)) >= $ac_mid)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? 
+ grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_lo=$ac_mid; break +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_hi=`expr '(' $ac_mid ')' - 1` + if test $ac_mid -le $ac_hi; then + ac_lo= ac_hi= + break + fi + ac_mid=`expr 2 '*' $ac_mid` +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext + done +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_lo= ac_hi= +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext +# Binary search between lo and hi bounds. +while test "x$ac_lo" != "x$ac_hi"; do + ac_mid=`expr '(' $ac_hi - $ac_lo ')' / 2 + $ac_lo` + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((__alignof (omp_lock_25_t)) <= $ac_mid)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! 
-s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_hi=$ac_mid +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_lo=`expr '(' $ac_mid ')' + 1` +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext +done +case $ac_lo in +?*) OMP_LOCK_25_ALIGN=$ac_lo;; +'') ;; +esac +else + if test "$cross_compiling" = yes; then + { { echo "$as_me:$LINENO: error: cannot run test program while cross compiling +See \`config.log' for more details." >&5 +echo "$as_me: error: cannot run test program while cross compiling +See \`config.log' for more details." >&2;} + { (exit 1); exit 1; }; } +else + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +long longval () { return __alignof (omp_lock_25_t); } +unsigned long ulongval () { return __alignof (omp_lock_25_t); } +#include <stdio.h> +#include <stdlib.h> +int +main () +{ + + FILE *f = fopen ("conftest.val", "w"); + if (! f) + exit (1); + if ((__alignof (omp_lock_25_t)) < 0) + { + long i = longval (); + if (i != (__alignof (omp_lock_25_t))) + exit (1); + fprintf (f, "%ld\n", i); + } + else + { + unsigned long i = ulongval (); + if (i != (__alignof (omp_lock_25_t))) + exit (1); + fprintf (f, "%lu\n", i); + } + exit (ferror (f) || fclose (f) != 0); + + ; + return 0; +} +_ACEOF +rm -f conftest$ac_exeext +if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 + (eval $ac_link) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 + (exit $ac_status); } && { ac_try='./conftest$ac_exeext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + OMP_LOCK_25_ALIGN=`cat conftest.val` +else + echo "$as_me: program exited with status $ac_status" >&5 +echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +fi +rm -f core *.core gmon.out bb.out conftest$ac_exeext conftest.$ac_objext conftest.$ac_ext +fi +fi +rm -f conftest.val +if test "$cross_compiling" = yes; then + # Depending upon the size, compute the lo and hi bounds. +cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((sizeof (omp_nest_lock_25_t)) >= 0)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_lo=0 ac_mid=0 + while :; do + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. 
*/ + +int +main () +{ +static int test_array [1 - 2 * !((sizeof (omp_nest_lock_25_t)) <= $ac_mid)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_hi=$ac_mid; break +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_lo=`expr $ac_mid + 1` + if test $ac_lo -le $ac_mid; then + ac_lo= ac_hi= + break + fi + ac_mid=`expr 2 '*' $ac_mid + 1` +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext + done +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((sizeof (omp_nest_lock_25_t)) < 0)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! 
-s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_hi=-1 ac_mid=-1 + while :; do + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((sizeof (omp_nest_lock_25_t)) >= $ac_mid)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 + (exit $ac_status); }; }; then + ac_lo=$ac_mid; break +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_hi=`expr '(' $ac_mid ')' - 1` + if test $ac_mid -le $ac_hi; then + ac_lo= ac_hi= + break + fi + ac_mid=`expr 2 '*' $ac_mid` +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext + done +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_lo= ac_hi= +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext +# Binary search between lo and hi bounds. +while test "x$ac_lo" != "x$ac_hi"; do + ac_mid=`expr '(' $ac_hi - $ac_lo ')' / 2 + $ac_lo` + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((sizeof (omp_nest_lock_25_t)) <= $ac_mid)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 + (exit $ac_status); }; }; then + ac_hi=$ac_mid +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_lo=`expr '(' $ac_mid ')' + 1` +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext +done +case $ac_lo in +?*) OMP_NEST_LOCK_25_SIZE=$ac_lo;; +'') ;; +esac +else + if test "$cross_compiling" = yes; then + { { echo "$as_me:$LINENO: error: cannot run test program while cross compiling +See \`config.log' for more details." >&5 +echo "$as_me: error: cannot run test program while cross compiling +See \`config.log' for more details." >&2;} + { (exit 1); exit 1; }; } +else + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +long longval () { return sizeof (omp_nest_lock_25_t); } +unsigned long ulongval () { return sizeof (omp_nest_lock_25_t); } +#include <stdio.h> +#include <stdlib.h> +int +main () +{ + + FILE *f = fopen ("conftest.val", "w"); + if (! f) + exit (1); + if ((sizeof (omp_nest_lock_25_t)) < 0) + { + long i = longval (); + if (i != (sizeof (omp_nest_lock_25_t))) + exit (1); + fprintf (f, "%ld\n", i); + } + else + { + unsigned long i = ulongval (); + if (i != (sizeof (omp_nest_lock_25_t))) + exit (1); + fprintf (f, "%lu\n", i); + } + exit (ferror (f) || fclose (f) != 0); + + ; + return 0; +} +_ACEOF +rm -f conftest$ac_exeext +if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 + (eval $ac_link) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && { ac_try='./conftest$ac_exeext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 + (exit $ac_status); }; }; then + OMP_NEST_LOCK_25_SIZE=`cat conftest.val` +else + echo "$as_me: program exited with status $ac_status" >&5 +echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +fi +rm -f core *.core gmon.out bb.out conftest$ac_exeext conftest.$ac_objext conftest.$ac_ext +fi +fi +rm -f conftest.val +if test "$cross_compiling" = yes; then + # Depending upon the size, compute the lo and hi bounds. +cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((__alignof (omp_nest_lock_25_t)) >= 0)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_lo=0 ac_mid=0 + while :; do + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((__alignof (omp_nest_lock_25_t)) <= $ac_mid)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? 
+ grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_hi=$ac_mid; break +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_lo=`expr $ac_mid + 1` + if test $ac_lo -le $ac_mid; then + ac_lo= ac_hi= + break + fi + ac_mid=`expr 2 '*' $ac_mid + 1` +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext + done +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((__alignof (omp_nest_lock_25_t)) < 0)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 + (exit $ac_status); }; }; then + ac_hi=-1 ac_mid=-1 + while :; do + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +int +main () +{ +static int test_array [1 - 2 * !((__alignof (omp_nest_lock_25_t)) >= $ac_mid)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_lo=$ac_mid; break +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_hi=`expr '(' $ac_mid ')' - 1` + if test $ac_mid -le $ac_hi; then + ac_lo= ac_hi= + break + fi + ac_mid=`expr 2 '*' $ac_mid` +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext + done +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_lo= ac_hi= +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext +# Binary search between lo and hi bounds. +while test "x$ac_lo" != "x$ac_hi"; do + ac_mid=`expr '(' $ac_hi - $ac_lo ')' / 2 + $ac_lo` + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. 
*/ + +int +main () +{ +static int test_array [1 - 2 * !((__alignof (omp_nest_lock_25_t)) <= $ac_mid)]; +test_array [0] = 0 + + ; + return 0; +} +_ACEOF +rm -f conftest.$ac_objext +if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 + (eval $ac_compile) 2>conftest.er1 + ac_status=$? + grep -v '^ *+' conftest.er1 >conftest.err + rm -f conftest.er1 + cat conftest.err >&5 + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && + { ac_try='test -z "$ac_c_werror_flag" + || test ! -s conftest.err' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } && + { ac_try='test -s conftest.$ac_objext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + ac_hi=$ac_mid +else + echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +ac_lo=`expr '(' $ac_mid ')' + 1` +fi +rm -f conftest.err conftest.$ac_objext conftest.$ac_ext +done +case $ac_lo in +?*) OMP_NEST_LOCK_25_ALIGN=$ac_lo;; +'') ;; +esac +else + if test "$cross_compiling" = yes; then + { { echo "$as_me:$LINENO: error: cannot run test program while cross compiling +See \`config.log' for more details." >&5 +echo "$as_me: error: cannot run test program while cross compiling +See \`config.log' for more details." >&2;} + { (exit 1); exit 1; }; } +else + cat >conftest.$ac_ext <<_ACEOF +/* confdefs.h. */ +_ACEOF +cat confdefs.h >>conftest.$ac_ext +cat >>conftest.$ac_ext <<_ACEOF +/* end confdefs.h. */ + +long longval () { return __alignof (omp_nest_lock_25_t); } +unsigned long ulongval () { return __alignof (omp_nest_lock_25_t); } +#include <stdio.h> +#include <stdlib.h> +int +main () +{ + + FILE *f = fopen ("conftest.val", "w"); + if (! 
f) + exit (1); + if ((__alignof (omp_nest_lock_25_t)) < 0) + { + long i = longval (); + if (i != (__alignof (omp_nest_lock_25_t))) + exit (1); + fprintf (f, "%ld\n", i); + } + else + { + unsigned long i = ulongval (); + if (i != (__alignof (omp_nest_lock_25_t))) + exit (1); + fprintf (f, "%lu\n", i); + } + exit (ferror (f) || fclose (f) != 0); + + ; + return 0; +} +_ACEOF +rm -f conftest$ac_exeext +if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 + (eval $ac_link) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } && { ac_try='./conftest$ac_exeext' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; }; then + OMP_NEST_LOCK_25_ALIGN=`cat conftest.val` +else + echo "$as_me: program exited with status $ac_status" >&5 +echo "$as_me: failed program was:" >&5 +sed 's/^/| /' conftest.$ac_ext >&5 + +fi +rm -f core *.core gmon.out bb.out conftest$ac_exeext conftest.$ac_objext conftest.$ac_ext +fi +fi +rm -f conftest.val # If the lock fits in an integer, then arrange for Fortran to use that # integer. If it doesn't, then arrange for Fortran to use a pointer. 
@@ -19485,6 +20803,20 @@ fi if test $OMP_NEST_LOCK_SIZE -gt 8 || test $OMP_NEST_LOCK_ALIGN -gt $OMP_NEST_LOCK_SIZE; then OMP_NEST_LOCK_KIND=8 fi +OMP_LOCK_25_KIND=$OMP_LOCK_25_SIZE +OMP_NEST_LOCK_25_KIND=$OMP_NEST_LOCK_25_SIZE +if test $OMP_LOCK_25_SIZE -gt 8 || test $OMP_LOCK_25_ALIGN -gt $OMP_LOCK_25_SIZE; then + OMP_LOCK_25_KIND=8 +fi +if test $OMP_NEST_LOCK_25_SIZE -gt 8 || test $OMP_NEST_LOCK_25_ALIGN -gt $OMP_NEST_LOCK_25_SIZE; then + OMP_NEST_LOCK_25_KIND=8 +fi + + + + + + @@ -20640,6 +21972,12 @@ s,@OMP_NEST_LOCK_SIZE@,$OMP_NEST_LOCK_SIZE,;t t s,@OMP_NEST_LOCK_ALIGN@,$OMP_NEST_LOCK_ALIGN,;t t s,@OMP_LOCK_KIND@,$OMP_LOCK_KIND,;t t s,@OMP_NEST_LOCK_KIND@,$OMP_NEST_LOCK_KIND,;t t +s,@OMP_LOCK_25_SIZE@,$OMP_LOCK_25_SIZE,;t t +s,@OMP_LOCK_25_ALIGN@,$OMP_LOCK_25_ALIGN,;t t +s,@OMP_NEST_LOCK_25_SIZE@,$OMP_NEST_LOCK_25_SIZE,;t t +s,@OMP_NEST_LOCK_25_ALIGN@,$OMP_NEST_LOCK_25_ALIGN,;t t +s,@OMP_LOCK_25_KIND@,$OMP_LOCK_25_KIND,;t t +s,@OMP_NEST_LOCK_25_KIND@,$OMP_NEST_LOCK_25_KIND,;t t s,@LIBOBJS@,$LIBOBJS,;t t s,@LTLIBOBJS@,$LTLIBOBJS,;t t CEOF diff --git a/libgomp/configure.ac b/libgomp/configure.ac index 47c61bea625..12c92340e8c 100644 --- a/libgomp/configure.ac +++ b/libgomp/configure.ac @@ -228,6 +228,11 @@ LIBGOMP_CHECK_ATTRIBUTE_DLLEXPORT LIBGOMP_CHECK_ATTRIBUTE_ALIAS LIBGOMP_ENABLE_SYMVERS +if test $enable_symvers = gnu; then + AC_DEFINE(LIBGOMP_GNU_SYMBOL_VERSIONING, 1, + [Define to 1 if GNU symbol versioning is used for libgomp.]) +fi + # Get target configury. . 
${srcdir}/configure.tgt CFLAGS="$save_CFLAGS $XCFLAGS" @@ -272,7 +277,7 @@ AM_CONDITIONAL([USE_FORTRAN], [test "$ac_cv_fc_compiler_gnu" = yes]) save_CFLAGS="$CFLAGS" for i in $config_path; do if test -f $srcdir/config/$i/omp-lock.h; then - CFLAGS="$CFLAGS -include $srcdir/config/$i/omp-lock.h" + CFLAGS="$CFLAGS -include confdefs.h -include $srcdir/config/$i/omp-lock.h" break fi done @@ -282,6 +287,11 @@ _AC_COMPUTE_INT([sizeof (omp_lock_t)], [OMP_LOCK_SIZE],, _AC_COMPUTE_INT([__alignof (omp_lock_t)], [OMP_LOCK_ALIGN]) _AC_COMPUTE_INT([sizeof (omp_nest_lock_t)], [OMP_NEST_LOCK_SIZE]) _AC_COMPUTE_INT([__alignof (omp_nest_lock_t)], [OMP_NEST_LOCK_ALIGN]) +_AC_COMPUTE_INT([sizeof (omp_lock_25_t)], [OMP_LOCK_25_SIZE],, + [AC_MSG_ERROR([unsupported system, cannot find sizeof (omp_lock_25_t)])]) +_AC_COMPUTE_INT([__alignof (omp_lock_25_t)], [OMP_LOCK_25_ALIGN]) +_AC_COMPUTE_INT([sizeof (omp_nest_lock_25_t)], [OMP_NEST_LOCK_25_SIZE]) +_AC_COMPUTE_INT([__alignof (omp_nest_lock_25_t)], [OMP_NEST_LOCK_25_ALIGN]) # If the lock fits in an integer, then arrange for Fortran to use that # integer. If it doesn't, then arrange for Fortran to use a pointer. 
@@ -296,6 +306,14 @@ fi if test $OMP_NEST_LOCK_SIZE -gt 8 || test $OMP_NEST_LOCK_ALIGN -gt $OMP_NEST_LOCK_SIZE; then OMP_NEST_LOCK_KIND=8 fi +OMP_LOCK_25_KIND=$OMP_LOCK_25_SIZE +OMP_NEST_LOCK_25_KIND=$OMP_NEST_LOCK_25_SIZE +if test $OMP_LOCK_25_SIZE -gt 8 || test $OMP_LOCK_25_ALIGN -gt $OMP_LOCK_25_SIZE; then + OMP_LOCK_25_KIND=8 +fi +if test $OMP_NEST_LOCK_25_SIZE -gt 8 || test $OMP_NEST_LOCK_25_ALIGN -gt $OMP_NEST_LOCK_25_SIZE; then + OMP_NEST_LOCK_25_KIND=8 +fi AC_SUBST(OMP_LOCK_SIZE) AC_SUBST(OMP_LOCK_ALIGN) @@ -303,6 +321,12 @@ AC_SUBST(OMP_NEST_LOCK_SIZE) AC_SUBST(OMP_NEST_LOCK_ALIGN) AC_SUBST(OMP_LOCK_KIND) AC_SUBST(OMP_NEST_LOCK_KIND) +AC_SUBST(OMP_LOCK_25_SIZE) +AC_SUBST(OMP_LOCK_25_ALIGN) +AC_SUBST(OMP_NEST_LOCK_25_SIZE) +AC_SUBST(OMP_NEST_LOCK_25_ALIGN) +AC_SUBST(OMP_LOCK_25_KIND) +AC_SUBST(OMP_NEST_LOCK_25_KIND) CFLAGS="$save_CFLAGS" AC_CONFIG_FILES(omp.h omp_lib.h omp_lib.f90 libgomp_f.h) diff --git a/libgomp/env.c b/libgomp/env.c index 6c6a35228eb..022fb1bb0ad 100644 --- a/libgomp/env.c +++ b/libgomp/env.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2005, 2006, 2007 Free Software Foundation, Inc. +/* Copyright (C) 2005, 2006, 2007, 2008 Free Software Foundation, Inc. Contributed by Richard Henderson <rth@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). @@ -13,7 +13,7 @@ FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. - You should have received a copy of the GNU Lesser General Public License + You should have received a copy of the GNU Lesser General Public License along with libgomp; see the file COPYING.LIB. If not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. 
*/ @@ -48,13 +48,24 @@ #include <errno.h> -unsigned long gomp_nthreads_var = 1; -bool gomp_dyn_var = false; -bool gomp_nest_var = false; -enum gomp_schedule_type gomp_run_sched_var = GFS_DYNAMIC; -unsigned long gomp_run_sched_chunk = 1; +struct gomp_task_icv gomp_global_icv = { + .nthreads_var = 1, + .run_sched_var = GFS_DYNAMIC, + .run_sched_modifier = 1, + .dyn_var = false, + .nest_var = false +}; + unsigned short *gomp_cpu_affinity; size_t gomp_cpu_affinity_len; +unsigned long gomp_max_active_levels_var = INT_MAX; +unsigned long gomp_thread_limit_var = ULONG_MAX; +unsigned long gomp_remaining_threads_count; +#ifndef HAVE_SYNC_BUILTINS +gomp_mutex_t gomp_remaining_threads_lock; +#endif +unsigned long gomp_available_cpus = 1, gomp_managed_threads = 1; +unsigned long long gomp_spin_count_var, gomp_throttled_spin_count_var; /* Parse the OMP_SCHEDULE environment variable. */ @@ -72,19 +83,24 @@ parse_schedule (void) ++env; if (strncasecmp (env, "static", 6) == 0) { - gomp_run_sched_var = GFS_STATIC; + gomp_global_icv.run_sched_var = GFS_STATIC; env += 6; } else if (strncasecmp (env, "dynamic", 7) == 0) { - gomp_run_sched_var = GFS_DYNAMIC; + gomp_global_icv.run_sched_var = GFS_DYNAMIC; env += 7; } else if (strncasecmp (env, "guided", 6) == 0) { - gomp_run_sched_var = GFS_GUIDED; + gomp_global_icv.run_sched_var = GFS_GUIDED; env += 6; } + else if (strncasecmp (env, "auto", 4) == 0) + { + gomp_global_icv.run_sched_var = GFS_AUTO; + env += 4; + } else goto unknown; @@ -109,7 +125,10 @@ parse_schedule (void) if (*end != '\0') goto invalid; - gomp_run_sched_chunk = value; + if ((int)value != value) + goto invalid; + + gomp_global_icv.run_sched_modifier = value; return; unknown: @@ -122,7 +141,7 @@ parse_schedule (void) return; } -/* Parse an unsigned long environment varible. Return true if one was +/* Parse an unsigned long environment variable. Return true if one was present and it was successfully parsed. 
*/ static bool @@ -158,7 +177,141 @@ parse_unsigned_long (const char *name, unsigned long *pvalue) return false; } -/* Parse a boolean value for environment variable NAME and store the +/* Parse the OMP_STACKSIZE environment varible. Return true if one was + present and it was successfully parsed. */ + +static bool +parse_stacksize (const char *name, unsigned long *pvalue) +{ + char *env, *end; + unsigned long value, shift = 10; + + env = getenv (name); + if (env == NULL) + return false; + + while (isspace ((unsigned char) *env)) + ++env; + if (*env == '\0') + goto invalid; + + errno = 0; + value = strtoul (env, &end, 10); + if (errno) + goto invalid; + + while (isspace ((unsigned char) *end)) + ++end; + if (*end != '\0') + { + switch (tolower (*end)) + { + case 'b': + shift = 0; + break; + case 'k': + break; + case 'm': + shift = 20; + break; + case 'g': + shift = 30; + break; + default: + goto invalid; + } + ++end; + while (isspace ((unsigned char) *end)) + ++end; + if (*end != '\0') + goto invalid; + } + + if (((value << shift) >> shift) != value) + goto invalid; + + *pvalue = value << shift; + return true; + + invalid: + gomp_error ("Invalid value for environment variable %s", name); + return false; +} + +/* Parse the GOMP_SPINCOUNT environment varible. Return true if one was + present and it was successfully parsed. 
*/ + +static bool +parse_spincount (const char *name, unsigned long long *pvalue) +{ + char *env, *end; + unsigned long long value, mult = 1; + + env = getenv (name); + if (env == NULL) + return false; + + while (isspace ((unsigned char) *env)) + ++env; + if (*env == '\0') + goto invalid; + + if (strncasecmp (env, "infinite", 8) == 0 + || strncasecmp (env, "infinity", 8) == 0) + { + value = ~0ULL; + end = env + 8; + goto check_tail; + } + + errno = 0; + value = strtoull (env, &end, 10); + if (errno) + goto invalid; + + while (isspace ((unsigned char) *end)) + ++end; + if (*end != '\0') + { + switch (tolower (*end)) + { + case 'k': + mult = 1000LL; + break; + case 'm': + mult = 1000LL * 1000LL; + break; + case 'g': + mult = 1000LL * 1000LL * 1000LL; + break; + case 't': + mult = 1000LL * 1000LL * 1000LL * 1000LL; + break; + default: + goto invalid; + } + ++end; + check_tail: + while (isspace ((unsigned char) *end)) + ++end; + if (*end != '\0') + goto invalid; + } + + if (value > ~0ULL / mult) + value = ~0ULL; + else + value *= mult; + + *pvalue = value; + return true; + + invalid: + gomp_error ("Invalid value for environment variable %s", name); + return false; +} + +/* Parse a boolean value for environment variable NAME and store the result in VALUE. */ static void @@ -190,6 +343,41 @@ parse_boolean (const char *name, bool *value) gomp_error ("Invalid value for environment variable %s", name); } +/* Parse the OMP_WAIT_POLICY environment variable and store the + result in gomp_active_wait_policy. 
*/ + +static int +parse_wait_policy (void) +{ + const char *env; + int ret = -1; + + env = getenv ("OMP_WAIT_POLICY"); + if (env == NULL) + return -1; + + while (isspace ((unsigned char) *env)) + ++env; + if (strncasecmp (env, "active", 6) == 0) + { + ret = 1; + env += 6; + } + else if (strncasecmp (env, "passive", 7) == 0) + { + ret = 0; + env += 7; + } + else + env = "X"; + while (isspace ((unsigned char) *env)) + ++env; + if (*env == '\0') + return ret; + gomp_error ("Invalid value for environment variable OMP_WAIT_POLICY"); + return -1; +} + /* Parse the GOMP_CPU_AFFINITY environment varible. Return true if one was present and it was successfully parsed. */ @@ -285,27 +473,61 @@ static void __attribute__((constructor)) initialize_env (void) { unsigned long stacksize; + int wait_policy; /* Do a compile time check that mkomp_h.pl did good job. */ omp_check_defines (); parse_schedule (); - parse_boolean ("OMP_DYNAMIC", &gomp_dyn_var); - parse_boolean ("OMP_NESTED", &gomp_nest_var); - if (!parse_unsigned_long ("OMP_NUM_THREADS", &gomp_nthreads_var)) - gomp_init_num_threads (); + parse_boolean ("OMP_DYNAMIC", &gomp_global_icv.dyn_var); + parse_boolean ("OMP_NESTED", &gomp_global_icv.nest_var); + parse_unsigned_long ("OMP_MAX_ACTIVE_LEVELS", &gomp_max_active_levels_var); + parse_unsigned_long ("OMP_THREAD_LIMIT", &gomp_thread_limit_var); + if (gomp_thread_limit_var != ULONG_MAX) + { + gomp_remaining_threads_count = gomp_thread_limit_var - 1; +#ifndef HAVE_SYNC_BUILTINS + gomp_mutex_init (&gomp_remaining_threads_lock); +#endif + } + gomp_init_num_threads (); + gomp_available_cpus = gomp_global_icv.nthreads_var; + if (!parse_unsigned_long ("OMP_NUM_THREADS", &gomp_global_icv.nthreads_var)) + gomp_global_icv.nthreads_var = gomp_available_cpus; if (parse_affinity ()) gomp_init_affinity (); + wait_policy = parse_wait_policy (); + if (!parse_spincount ("GOMP_SPINCOUNT", &gomp_spin_count_var)) + { + /* Using a rough estimation of 100000 spins per msec, + use 5 min blocking 
for OMP_WAIT_POLICY=active, + 200 msec blocking when OMP_WAIT_POLICY is not specificed + and 0 when OMP_WAIT_POLICY=passive. + Depending on the CPU speed, this can be e.g. 5 times longer + or 5 times shorter. */ + if (wait_policy > 0) + gomp_spin_count_var = 30000000000LL; + else if (wait_policy < 0) + gomp_spin_count_var = 20000000LL; + } + /* gomp_throttled_spin_count_var is used when there are more libgomp + managed threads than available CPUs. Use very short spinning. */ + if (wait_policy > 0) + gomp_throttled_spin_count_var = 1000LL; + else if (wait_policy < 0) + gomp_throttled_spin_count_var = 100LL; + if (gomp_throttled_spin_count_var > gomp_spin_count_var) + gomp_throttled_spin_count_var = gomp_spin_count_var; /* Not strictly environment related, but ordering constructors is tricky. */ pthread_attr_init (&gomp_thread_attr); pthread_attr_setdetachstate (&gomp_thread_attr, PTHREAD_CREATE_DETACHED); - if (parse_unsigned_long ("GOMP_STACKSIZE", &stacksize)) + if (parse_stacksize ("OMP_STACKSIZE", &stacksize) + || parse_stacksize ("GOMP_STACKSIZE", &stacksize)) { int err; - stacksize *= 1024; err = pthread_attr_setstacksize (&gomp_thread_attr, stacksize); #ifdef PTHREAD_STACK_MIN @@ -331,31 +553,95 @@ initialize_env (void) void omp_set_num_threads (int n) { - gomp_nthreads_var = (n > 0 ? n : 1); + struct gomp_task_icv *icv = gomp_icv (true); + icv->nthreads_var = (n > 0 ? 
n : 1); } void omp_set_dynamic (int val) { - gomp_dyn_var = val; + struct gomp_task_icv *icv = gomp_icv (true); + icv->dyn_var = val; } int omp_get_dynamic (void) { - return gomp_dyn_var; + struct gomp_task_icv *icv = gomp_icv (false); + return icv->dyn_var; } void omp_set_nested (int val) { - gomp_nest_var = val; + struct gomp_task_icv *icv = gomp_icv (true); + icv->nest_var = val; } int omp_get_nested (void) { - return gomp_nest_var; + struct gomp_task_icv *icv = gomp_icv (false); + return icv->nest_var; +} + +void +omp_set_schedule (omp_sched_t kind, int modifier) +{ + struct gomp_task_icv *icv = gomp_icv (true); + switch (kind) + { + case omp_sched_static: + if (modifier < 1) + modifier = 0; + icv->run_sched_modifier = modifier; + break; + case omp_sched_dynamic: + case omp_sched_guided: + if (modifier < 1) + modifier = 1; + icv->run_sched_modifier = modifier; + break; + case omp_sched_auto: + break; + default: + return; + } + icv->run_sched_var = kind; +} + +void +omp_get_schedule (omp_sched_t *kind, int *modifier) +{ + struct gomp_task_icv *icv = gomp_icv (false); + *kind = icv->run_sched_var; + *modifier = icv->run_sched_modifier; +} + +int +omp_get_max_threads (void) +{ + struct gomp_task_icv *icv = gomp_icv (false); + return icv->nthreads_var; +} + +int +omp_get_thread_limit (void) +{ + return gomp_thread_limit_var > INT_MAX ? 
INT_MAX : gomp_thread_limit_var; +} + +void +omp_set_max_active_levels (int max_levels) +{ + if (max_levels > 0) + gomp_max_active_levels_var = max_levels; +} + +int +omp_get_max_active_levels (void) +{ + return gomp_max_active_levels_var; } ialias (omp_set_dynamic) @@ -363,3 +649,9 @@ ialias (omp_set_nested) ialias (omp_set_num_threads) ialias (omp_get_dynamic) ialias (omp_get_nested) +ialias (omp_set_schedule) +ialias (omp_get_schedule) +ialias (omp_get_max_threads) +ialias (omp_get_thread_limit) +ialias (omp_set_max_active_levels) +ialias (omp_get_max_active_levels) diff --git a/libgomp/fortran.c b/libgomp/fortran.c index f6f64c61be2..1e20aea28ee 100644 --- a/libgomp/fortran.c +++ b/libgomp/fortran.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2005 Free Software Foundation, Inc. +/* Copyright (C) 2005, 2007, 2008 Free Software Foundation, Inc. Contributed by Jakub Jelinek <jakub@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). @@ -33,11 +33,12 @@ #ifdef HAVE_ATTRIBUTE_ALIAS /* Use internal aliases if possible. 
*/ -#define ULP STR1(__USER_LABEL_PREFIX__) -#define STR1(x) STR2(x) -#define STR2(x) #x +# define ULP STR1(__USER_LABEL_PREFIX__) +# define STR1(x) STR2(x) +# define STR2(x) #x # define ialias_redirect(fn) \ extern __typeof (fn) fn __asm__ (ULP "gomp_ialias_" #fn) attribute_hidden; +# ifndef LIBGOMP_GNU_SYMBOL_VERSIONING ialias_redirect (omp_init_lock) ialias_redirect (omp_init_nest_lock) ialias_redirect (omp_destroy_lock) @@ -48,6 +49,7 @@ ialias_redirect (omp_unset_lock) ialias_redirect (omp_unset_nest_lock) ialias_redirect (omp_test_lock) ialias_redirect (omp_test_nest_lock) +# endif ialias_redirect (omp_set_dynamic) ialias_redirect (omp_set_nested) ialias_redirect (omp_set_num_threads) @@ -60,30 +62,52 @@ ialias_redirect (omp_get_num_threads) ialias_redirect (omp_get_thread_num) ialias_redirect (omp_get_wtick) ialias_redirect (omp_get_wtime) -#endif +ialias_redirect (omp_set_schedule) +ialias_redirect (omp_get_schedule) +ialias_redirect (omp_get_thread_limit) +ialias_redirect (omp_set_max_active_levels) +ialias_redirect (omp_get_max_active_levels) +ialias_redirect (omp_get_level) +ialias_redirect (omp_get_ancestor_thread_num) +ialias_redirect (omp_get_team_size) +ialias_redirect (omp_get_active_level) +#endif + +#ifndef LIBGOMP_GNU_SYMBOL_VERSIONING +# define gomp_init_lock__30 omp_init_lock_ +# define gomp_destroy_lock__30 omp_destroy_lock_ +# define gomp_set_lock__30 omp_set_lock_ +# define gomp_unset_lock__30 omp_unset_lock_ +# define gomp_test_lock__30 omp_test_lock_ +# define gomp_init_nest_lock__30 omp_init_nest_lock_ +# define gomp_destroy_nest_lock__30 omp_destroy_nest_lock_ +# define gomp_set_nest_lock__30 omp_set_nest_lock_ +# define gomp_unset_nest_lock__30 omp_unset_nest_lock_ +# define gomp_test_nest_lock__30 omp_test_nest_lock_ +#endif void -omp_init_lock_ (omp_lock_arg_t lock) +gomp_init_lock__30 (omp_lock_arg_t lock) { #ifndef OMP_LOCK_DIRECT omp_lock_arg (lock) = malloc (sizeof (omp_lock_t)); #endif - omp_init_lock (omp_lock_arg (lock)); + 
gomp_init_lock_30 (omp_lock_arg (lock)); } void -omp_init_nest_lock_ (omp_nest_lock_arg_t lock) +gomp_init_nest_lock__30 (omp_nest_lock_arg_t lock) { #ifndef OMP_NEST_LOCK_DIRECT omp_nest_lock_arg (lock) = malloc (sizeof (omp_nest_lock_t)); #endif - omp_init_nest_lock (omp_nest_lock_arg (lock)); + gomp_init_nest_lock_30 (omp_nest_lock_arg (lock)); } void -omp_destroy_lock_ (omp_lock_arg_t lock) +gomp_destroy_lock__30 (omp_lock_arg_t lock) { - omp_destroy_lock (omp_lock_arg (lock)); + gomp_destroy_lock_30 (omp_lock_arg (lock)); #ifndef OMP_LOCK_DIRECT free (omp_lock_arg (lock)); omp_lock_arg (lock) = NULL; @@ -91,9 +115,9 @@ omp_destroy_lock_ (omp_lock_arg_t lock) } void -omp_destroy_nest_lock_ (omp_nest_lock_arg_t lock) +gomp_destroy_nest_lock__30 (omp_nest_lock_arg_t lock) { - omp_destroy_nest_lock (omp_nest_lock_arg (lock)); + gomp_destroy_nest_lock_30 (omp_nest_lock_arg (lock)); #ifndef OMP_NEST_LOCK_DIRECT free (omp_nest_lock_arg (lock)); omp_nest_lock_arg (lock) = NULL; @@ -101,30 +125,129 @@ omp_destroy_nest_lock_ (omp_nest_lock_arg_t lock) } void -omp_set_lock_ (omp_lock_arg_t lock) +gomp_set_lock__30 (omp_lock_arg_t lock) +{ + gomp_set_lock_30 (omp_lock_arg (lock)); +} + +void +gomp_set_nest_lock__30 (omp_nest_lock_arg_t lock) { - omp_set_lock (omp_lock_arg (lock)); + gomp_set_nest_lock_30 (omp_nest_lock_arg (lock)); } void -omp_set_nest_lock_ (omp_nest_lock_arg_t lock) +gomp_unset_lock__30 (omp_lock_arg_t lock) { - omp_set_nest_lock (omp_nest_lock_arg (lock)); + gomp_unset_lock_30 (omp_lock_arg (lock)); } void -omp_unset_lock_ (omp_lock_arg_t lock) +gomp_unset_nest_lock__30 (omp_nest_lock_arg_t lock) +{ + gomp_unset_nest_lock_30 (omp_nest_lock_arg (lock)); +} + +int32_t +gomp_test_lock__30 (omp_lock_arg_t lock) +{ + return gomp_test_lock_30 (omp_lock_arg (lock)); +} + +int32_t +gomp_test_nest_lock__30 (omp_nest_lock_arg_t lock) { - omp_unset_lock (omp_lock_arg (lock)); + return gomp_test_nest_lock_30 (omp_nest_lock_arg (lock)); } +#ifdef 
LIBGOMP_GNU_SYMBOL_VERSIONING void -omp_unset_nest_lock_ (omp_nest_lock_arg_t lock) +gomp_init_lock__25 (omp_lock_25_arg_t lock) { - omp_unset_nest_lock (omp_nest_lock_arg (lock)); +#ifndef OMP_LOCK_25_DIRECT + omp_lock_25_arg (lock) = malloc (sizeof (omp_lock_25_t)); +#endif + gomp_init_lock_25 (omp_lock_25_arg (lock)); +} + +void +gomp_init_nest_lock__25 (omp_nest_lock_25_arg_t lock) +{ +#ifndef OMP_NEST_LOCK_25_DIRECT + omp_nest_lock_25_arg (lock) = malloc (sizeof (omp_nest_lock_25_t)); +#endif + gomp_init_nest_lock_25 (omp_nest_lock_25_arg (lock)); +} + +void +gomp_destroy_lock__25 (omp_lock_25_arg_t lock) +{ + gomp_destroy_lock_25 (omp_lock_25_arg (lock)); +#ifndef OMP_LOCK_25_DIRECT + free (omp_lock_25_arg (lock)); + omp_lock_25_arg (lock) = NULL; +#endif +} + +void +gomp_destroy_nest_lock__25 (omp_nest_lock_25_arg_t lock) +{ + gomp_destroy_nest_lock_25 (omp_nest_lock_25_arg (lock)); +#ifndef OMP_NEST_LOCK_25_DIRECT + free (omp_nest_lock_25_arg (lock)); + omp_nest_lock_25_arg (lock) = NULL; +#endif } void +gomp_set_lock__25 (omp_lock_25_arg_t lock) +{ + gomp_set_lock_25 (omp_lock_25_arg (lock)); +} + +void +gomp_set_nest_lock__25 (omp_nest_lock_25_arg_t lock) +{ + gomp_set_nest_lock_25 (omp_nest_lock_25_arg (lock)); +} + +void +gomp_unset_lock__25 (omp_lock_25_arg_t lock) +{ + gomp_unset_lock_25 (omp_lock_25_arg (lock)); +} + +void +gomp_unset_nest_lock__25 (omp_nest_lock_25_arg_t lock) +{ + gomp_unset_nest_lock_25 (omp_nest_lock_25_arg (lock)); +} + +int32_t +gomp_test_lock__25 (omp_lock_25_arg_t lock) +{ + return gomp_test_lock_25 (omp_lock_25_arg (lock)); +} + +int32_t +gomp_test_nest_lock__25 (omp_nest_lock_25_arg_t lock) +{ + return gomp_test_nest_lock_25 (omp_nest_lock_25_arg (lock)); +} + +omp_lock_symver (omp_init_lock_) +omp_lock_symver (omp_destroy_lock_) +omp_lock_symver (omp_set_lock_) +omp_lock_symver (omp_unset_lock_) +omp_lock_symver (omp_test_lock_) +omp_lock_symver (omp_init_nest_lock_) +omp_lock_symver (omp_destroy_nest_lock_) 
+omp_lock_symver (omp_set_nest_lock_) +omp_lock_symver (omp_unset_nest_lock_) +omp_lock_symver (omp_test_nest_lock_) +#endif + +void omp_set_dynamic_ (const int32_t *set) { omp_set_dynamic (*set); @@ -179,12 +302,6 @@ omp_in_parallel_ (void) } int32_t -omp_test_lock_ (omp_lock_arg_t lock) -{ - return omp_test_lock (omp_lock_arg (lock)); -} - -int32_t omp_get_max_threads_ (void) { return omp_get_max_threads (); @@ -208,12 +325,6 @@ omp_get_thread_num_ (void) return omp_get_thread_num (); } -int32_t -omp_test_nest_lock_ (omp_nest_lock_arg_t lock) -{ - return omp_test_nest_lock (omp_nest_lock_arg (lock)); -} - double omp_get_wtick_ (void) { @@ -225,3 +336,95 @@ omp_get_wtime_ (void) { return omp_get_wtime (); } + +void +omp_set_schedule_ (const int32_t *kind, const int32_t *modifier) +{ + omp_set_schedule (*kind, *modifier); +} + +void +omp_set_schedule_8_ (const int32_t *kind, const int64_t *modifier) +{ + omp_set_schedule (*kind, *modifier); +} + +void +omp_get_schedule_ (int32_t *kind, int32_t *modifier) +{ + omp_sched_t k; + int m; + omp_get_schedule (&k, &m); + *kind = k; + *modifier = m; +} + +void +omp_get_schedule_8_ (int32_t *kind, int64_t *modifier) +{ + omp_sched_t k; + int m; + omp_get_schedule (&k, &m); + *kind = k; + *modifier = m; +} + +int32_t +omp_get_thread_limit_ (void) +{ + return omp_get_thread_limit (); +} + +void +omp_set_max_active_levels_ (const int32_t *levels) +{ + omp_set_max_active_levels (*levels); +} + +void +omp_set_max_active_levels_8_ (const int64_t *levels) +{ + omp_set_max_active_levels (*levels); +} + +int32_t +omp_get_max_active_levels_ (void) +{ + return omp_get_max_active_levels (); +} + +int32_t +omp_get_level_ (void) +{ + return omp_get_level (); +} + +int32_t +omp_get_ancestor_thread_num_ (const int32_t *level) +{ + return omp_get_ancestor_thread_num (*level); +} + +int32_t +omp_get_ancestor_thread_num_8_ (const int64_t *level) +{ + return omp_get_ancestor_thread_num (*level); +} + +int32_t +omp_get_team_size_ (const int32_t 
*level) +{ + return omp_get_team_size (*level); +} + +int32_t +omp_get_team_size_8_ (const int64_t *level) +{ + return omp_get_team_size (*level); +} + +int32_t +omp_get_active_level_ (void) +{ + return omp_get_active_level (); +} diff --git a/libgomp/iter.c b/libgomp/iter.c index 2d5dd2edd5a..f186058be46 100644 --- a/libgomp/iter.c +++ b/libgomp/iter.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2005 Free Software Foundation, Inc. +/* Copyright (C) 2005, 2008 Free Software Foundation, Inc. Contributed by Richard Henderson <rth@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). @@ -154,7 +154,7 @@ gomp_iter_dynamic_next_locked (long *pstart, long *pend) if (start == ws->end) return false; - chunk = ws->chunk_size * ws->incr; + chunk = ws->chunk_size; left = ws->end - start; if (ws->incr < 0) { @@ -186,11 +186,38 @@ gomp_iter_dynamic_next (long *pstart, long *pend) struct gomp_work_share *ws = thr->ts.work_share; long start, end, nend, chunk, incr; - start = ws->next; end = ws->end; incr = ws->incr; - chunk = ws->chunk_size * incr; + chunk = ws->chunk_size; + + if (__builtin_expect (ws->mode, 1)) + { + long tmp = __sync_fetch_and_add (&ws->next, chunk); + if (incr > 0) + { + if (tmp >= end) + return false; + nend = tmp + chunk; + if (nend > end) + nend = end; + *pstart = tmp; + *pend = nend; + return true; + } + else + { + if (tmp <= end) + return false; + nend = tmp + chunk; + if (nend < end) + nend = end; + *pstart = tmp; + *pend = nend; + return true; + } + } + start = ws->next; while (1) { long left = end - start; diff --git a/libgomp/iter_ull.c b/libgomp/iter_ull.c new file mode 100644 index 00000000000..d6262dafee5 --- /dev/null +++ b/libgomp/iter_ull.c @@ -0,0 +1,344 @@ +/* Copyright (C) 2005, 2008 Free Software Foundation, Inc. + Contributed by Richard Henderson <rth@redhat.com>. + + This file is part of the GNU OpenMP Library (libgomp). 
+ + Libgomp is free software; you can redistribute it and/or modify it + under the terms of the GNU Lesser General Public License as published by + the Free Software Foundation; either version 2.1 of the License, or + (at your option) any later version. + + Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for + more details. + + You should have received a copy of the GNU Lesser General Public License + along with libgomp; see the file COPYING.LIB. If not, write to the + Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, + MA 02110-1301, USA. */ + +/* As a special exception, if you link this library with other files, some + of which are compiled with GCC, to produce an executable, this library + does not by itself cause the resulting executable to be covered by the + GNU General Public License. This exception does not however invalidate + any other reasons why the executable file might be covered by the GNU + General Public License. */ + +/* This file contains routines for managing work-share iteration, both + for loops and sections. */ + +#include "libgomp.h" +#include <stdlib.h> + +typedef unsigned long long gomp_ull; + +/* This function implements the STATIC scheduling method. The caller should + iterate *pstart <= x < *pend. Return zero if there are more iterations + to perform; nonzero if not. Return less than 0 if this thread had + received the absolutely last iteration. */ + +int +gomp_iter_ull_static_next (gomp_ull *pstart, gomp_ull *pend) +{ + struct gomp_thread *thr = gomp_thread (); + struct gomp_team *team = thr->ts.team; + struct gomp_work_share *ws = thr->ts.work_share; + unsigned long nthreads = team ? team->nthreads : 1; + + if (thr->ts.static_trip == -1) + return -1; + + /* Quick test for degenerate teams and orphaned constructs. 
*/ + if (nthreads == 1) + { + *pstart = ws->next_ull; + *pend = ws->end_ull; + thr->ts.static_trip = -1; + return ws->next_ull == ws->end_ull; + } + + /* We interpret chunk_size zero as "unspecified", which means that we + should break up the iterations such that each thread makes only one + trip through the outer loop. */ + if (ws->chunk_size_ull == 0) + { + gomp_ull n, q, i, s0, e0, s, e; + + if (thr->ts.static_trip > 0) + return 1; + + /* Compute the total number of iterations. */ + if (__builtin_expect (ws->mode, 0) == 0) + n = (ws->end_ull - ws->next_ull + ws->incr_ull - 1) / ws->incr_ull; + else + n = (ws->next_ull - ws->end_ull - ws->incr_ull - 1) / -ws->incr_ull; + i = thr->ts.team_id; + + /* Compute the "zero-based" start and end points. That is, as + if the loop began at zero and incremented by one. */ + q = n / nthreads; + q += (q * nthreads != n); + s0 = q * i; + e0 = s0 + q; + if (e0 > n) + e0 = n; + + /* Notice when no iterations allocated for this thread. */ + if (s0 >= e0) + { + thr->ts.static_trip = 1; + return 1; + } + + /* Transform these to the actual start and end numbers. */ + s = s0 * ws->incr_ull + ws->next_ull; + e = e0 * ws->incr_ull + ws->next_ull; + + *pstart = s; + *pend = e; + thr->ts.static_trip = (e0 == n ? -1 : 1); + return 0; + } + else + { + gomp_ull n, s0, e0, i, c, s, e; + + /* Otherwise, each thread gets exactly chunk_size iterations + (if available) each time through the loop. */ + + if (__builtin_expect (ws->mode, 0) == 0) + n = (ws->end_ull - ws->next_ull + ws->incr_ull - 1) / ws->incr_ull; + else + n = (ws->next_ull - ws->end_ull - ws->incr_ull - 1) / -ws->incr_ull; + i = thr->ts.team_id; + c = ws->chunk_size_ull; + + /* Initial guess is a C sized chunk positioned nthreads iterations + in, offset by our thread number. */ + s0 = (thr->ts.static_trip * (gomp_ull) nthreads + i) * c; + e0 = s0 + c; + + /* Detect overflow. 
*/ + if (s0 >= n) + return 1; + if (e0 > n) + e0 = n; + + /* Transform these to the actual start and end numbers. */ + s = s0 * ws->incr_ull + ws->next_ull; + e = e0 * ws->incr_ull + ws->next_ull; + + *pstart = s; + *pend = e; + + if (e0 == n) + thr->ts.static_trip = -1; + else + thr->ts.static_trip++; + return 0; + } +} + + +/* This function implements the DYNAMIC scheduling method. Arguments are + as for gomp_iter_ull_static_next. This function must be called with + ws->lock held. */ + +bool +gomp_iter_ull_dynamic_next_locked (gomp_ull *pstart, gomp_ull *pend) +{ + struct gomp_thread *thr = gomp_thread (); + struct gomp_work_share *ws = thr->ts.work_share; + gomp_ull start, end, chunk, left; + + start = ws->next_ull; + if (start == ws->end_ull) + return false; + + chunk = ws->chunk_size_ull; + left = ws->end_ull - start; + if (__builtin_expect (ws->mode & 2, 0)) + { + if (chunk < left) + chunk = left; + } + else + { + if (chunk > left) + chunk = left; + } + end = start + chunk; + + ws->next_ull = end; + *pstart = start; + *pend = end; + return true; +} + + +#if defined HAVE_SYNC_BUILTINS && defined __LP64__ +/* Similar, but doesn't require the lock held, and uses compare-and-swap + instead. Note that the only memory value that changes is ws->next_ull. 
*/ + +bool +gomp_iter_ull_dynamic_next (gomp_ull *pstart, gomp_ull *pend) +{ + struct gomp_thread *thr = gomp_thread (); + struct gomp_work_share *ws = thr->ts.work_share; + gomp_ull start, end, nend, chunk; + + end = ws->end_ull; + chunk = ws->chunk_size_ull; + + if (__builtin_expect (ws->mode & 1, 1)) + { + gomp_ull tmp = __sync_fetch_and_add (&ws->next_ull, chunk); + if (__builtin_expect (ws->mode & 2, 0) == 0) + { + if (tmp >= end) + return false; + nend = tmp + chunk; + if (nend > end) + nend = end; + *pstart = tmp; + *pend = nend; + return true; + } + else + { + if (tmp <= end) + return false; + nend = tmp + chunk; + if (nend < end) + nend = end; + *pstart = tmp; + *pend = nend; + return true; + } + } + + start = ws->next_ull; + while (1) + { + gomp_ull left = end - start; + gomp_ull tmp; + + if (start == end) + return false; + + if (__builtin_expect (ws->mode & 2, 0)) + { + if (chunk < left) + chunk = left; + } + else + { + if (chunk > left) + chunk = left; + } + nend = start + chunk; + + tmp = __sync_val_compare_and_swap (&ws->next_ull, start, nend); + if (__builtin_expect (tmp == start, 1)) + break; + + start = tmp; + } + + *pstart = start; + *pend = nend; + return true; +} +#endif /* HAVE_SYNC_BUILTINS */ + + +/* This function implements the GUIDED scheduling method. Arguments are + as for gomp_iter_ull_static_next. This function must be called with the + work share lock held. */ + +bool +gomp_iter_ull_guided_next_locked (gomp_ull *pstart, gomp_ull *pend) +{ + struct gomp_thread *thr = gomp_thread (); + struct gomp_work_share *ws = thr->ts.work_share; + struct gomp_team *team = thr->ts.team; + gomp_ull nthreads = team ? 
team->nthreads : 1; + gomp_ull n, q; + gomp_ull start, end; + + if (ws->next_ull == ws->end_ull) + return false; + + start = ws->next_ull; + if (__builtin_expect (ws->mode, 0) == 0) + n = (ws->end_ull - start) / ws->incr_ull; + else + n = (start - ws->end_ull) / -ws->incr_ull; + q = (n + nthreads - 1) / nthreads; + + if (q < ws->chunk_size_ull) + q = ws->chunk_size_ull; + if (q <= n) + end = start + q * ws->incr_ull; + else + end = ws->end_ull; + + ws->next_ull = end; + *pstart = start; + *pend = end; + return true; +} + +#if defined HAVE_SYNC_BUILTINS && defined __LP64__ +/* Similar, but doesn't require the lock held, and uses compare-and-swap + instead. Note that the only memory value that changes is ws->next_ull. */ + +bool +gomp_iter_ull_guided_next (gomp_ull *pstart, gomp_ull *pend) +{ + struct gomp_thread *thr = gomp_thread (); + struct gomp_work_share *ws = thr->ts.work_share; + struct gomp_team *team = thr->ts.team; + gomp_ull nthreads = team ? team->nthreads : 1; + gomp_ull start, end, nend, incr; + gomp_ull chunk_size; + + start = ws->next_ull; + end = ws->end_ull; + incr = ws->incr_ull; + chunk_size = ws->chunk_size_ull; + + while (1) + { + gomp_ull n, q; + gomp_ull tmp; + + if (start == end) + return false; + + if (__builtin_expect (ws->mode, 0) == 0) + n = (end - start) / incr; + else + n = (start - end) / -incr; + q = (n + nthreads - 1) / nthreads; + + if (q < chunk_size) + q = chunk_size; + if (__builtin_expect (q <= n, 1)) + nend = start + q * incr; + else + nend = end; + + tmp = __sync_val_compare_and_swap (&ws->next_ull, start, nend); + if (__builtin_expect (tmp == start, 1)) + break; + + start = tmp; + } + + *pstart = start; + *pend = nend; + return true; +} +#endif /* HAVE_SYNC_BUILTINS */ diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h index 7075250a87f..66180122c1e 100644 --- a/libgomp/libgomp.h +++ b/libgomp/libgomp.h @@ -1,4 +1,4 @@ -/* Copyright (C) 2005, 2007 Free Software Foundation, Inc. 
+/* Copyright (C) 2005, 2007, 2008 Free Software Foundation, Inc. Contributed by Richard Henderson <rth@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). @@ -50,6 +50,7 @@ #include "sem.h" #include "mutex.h" #include "bar.h" +#include "ptrlock.h" /* This structure contains the data to control one work-sharing construct, @@ -57,10 +58,11 @@ enum gomp_schedule_type { + GFS_RUNTIME, GFS_STATIC, GFS_DYNAMIC, GFS_GUIDED, - GFS_RUNTIME + GFS_AUTO }; struct gomp_work_share @@ -70,56 +72,94 @@ struct gomp_work_share If this is a SECTIONS construct, this value will always be DYNAMIC. */ enum gomp_schedule_type sched; - /* This is the chunk_size argument to the SCHEDULE clause. */ - long chunk_size; + int mode; - /* This is the iteration end point. If this is a SECTIONS construct, - this is the number of contained sections. */ - long end; + union { + struct { + /* This is the chunk_size argument to the SCHEDULE clause. */ + long chunk_size; + + /* This is the iteration end point. If this is a SECTIONS construct, + this is the number of contained sections. */ + long end; + + /* This is the iteration step. If this is a SECTIONS construct, this + is always 1. */ + long incr; + }; + + struct { + /* The same as above, but for the unsigned long long loop variants. */ + unsigned long long chunk_size_ull; + unsigned long long end_ull; + unsigned long long incr_ull; + }; + }; + + /* This is a circular queue that details which threads will be allowed + into the ordered region and in which order. When a thread allocates + iterations on which it is going to work, it also registers itself at + the end of the array. When a thread reaches the ordered region, it + checks to see if it is the one at the head of the queue. If not, it + blocks on its RELEASE semaphore. */ + unsigned *ordered_team_ids; + + /* This is the number of threads that have registered themselves in + the circular queue ordered_team_ids. 
*/ + unsigned ordered_num_used; + + /* This is the team_id of the currently acknowledged owner of the ordered + section, or -1u if the ordered section has not been acknowledged by + any thread. This is distinguished from the thread that is *allowed* + to take the section next. */ + unsigned ordered_owner; + + /* This is the index into the circular queue ordered_team_ids of the + current thread that's allowed into the ordered reason. */ + unsigned ordered_cur; - /* This is the iteration step. If this is a SECTIONS construct, this - is always 1. */ - long incr; + /* This is a chain of allocated gomp_work_share blocks, valid only + in the first gomp_work_share struct in the block. */ + struct gomp_work_share *next_alloc; + + /* The above fields are written once during workshare initialization, + or related to ordered worksharing. Make sure the following fields + are in a different cache line. */ /* This lock protects the update of the following members. */ - gomp_mutex_t lock; + gomp_mutex_t lock __attribute__((aligned (64))); + + /* This is the count of the number of threads that have exited the work + share construct. If the construct was marked nowait, they have moved on + to other work; otherwise they're blocked on a barrier. The last member + of the team to exit the work share construct must deallocate it. */ + unsigned threads_completed; union { /* This is the next iteration value to be allocated. In the case of GFS_STATIC loops, this the iteration start point and never changes. */ long next; + /* The same, but with unsigned long long type. */ + unsigned long long next_ull; + /* This is the returned data structure for SINGLE COPYPRIVATE. */ void *copyprivate; }; - /* This is the count of the number of threads that have exited the work - share construct. If the construct was marked nowait, they have moved on - to other work; otherwise they're blocked on a barrier. The last member - of the team to exit the work share construct must deallocate it. 
*/ - unsigned threads_completed; - - /* This is the index into the circular queue ordered_team_ids of the - current thread that's allowed into the ordered reason. */ - unsigned ordered_cur; - - /* This is the number of threads that have registered themselves in - the circular queue ordered_team_ids. */ - unsigned ordered_num_used; + union { + /* Link to gomp_work_share struct for next work sharing construct + encountered after this one. */ + gomp_ptrlock_t next_ws; - /* This is the team_id of the currently acknoledged owner of the ordered - section, or -1u if the ordered section has not been acknowledged by - any thread. This is distinguished from the thread that is *allowed* - to take the section next. */ - unsigned ordered_owner; + /* gomp_work_share structs are chained in the free work share cache + through this. */ + struct gomp_work_share *next_free; + }; - /* This is a circular queue that details which threads will be allowed - into the ordered region and in which order. When a thread allocates - iterations on which it is going to work, it also registers itself at - the end of the array. When a thread reaches the ordered region, it - checks to see if it is the one at the head of the queue. If not, it - blocks on its RELEASE semaphore. */ - unsigned ordered_team_ids[]; + /* If only few threads are in the team, ordered_team_ids can point + to this array which fills the padding at the end of this struct. */ + unsigned inline_ordered_team_ids[0]; }; /* This structure contains all of the thread-local data associated with @@ -133,21 +173,30 @@ struct gomp_team_state /* This is the work share construct which this thread is currently processing. Recall that with NOWAIT, not all threads may be - processing the same construct. This value is NULL when there - is no construct being processed. */ + processing the same construct. */ struct gomp_work_share *work_share; + /* This is the previous work share construct or NULL if there wasn't any. 
+ When all threads are done with the current work sharing construct, + the previous one can be freed. The current one can't, as its + next_ws field is used. */ + struct gomp_work_share *last_work_share; + /* This is the ID of this thread within the team. This value is guaranteed to be between 0 and N-1, where N is the number of threads in the team. */ unsigned team_id; - /* The work share "generation" is a number that increases by one for - each work share construct encountered in the dynamic flow of the - program. It is used to find the control data for the work share - when encountering it for the first time. This particular number - reflects the generation of the work_share member of this struct. */ - unsigned work_share_generation; + /* Nesting level. */ + unsigned level; + + /* Active nesting level. Only active parallel regions are counted. */ + unsigned active_level; + +#ifdef HAVE_SYNC_BUILTINS + /* Number of single stmts encountered. */ + unsigned long single_count; +#endif /* For GFS_RUNTIME loops that resolved to GFS_STATIC, this is the trip number through the loop. So first time a particular loop @@ -157,48 +206,118 @@ struct gomp_team_state unsigned long static_trip; }; -/* This structure describes a "team" of threads. These are the threads - that are spawned by a PARALLEL constructs, as well as the work sharing - constructs that the team encounters. */ +/* These are the OpenMP 3.0 Internal Control Variables described in + section 2.3.1. Those described as having one copy per task are + stored within the structure; those described as having one copy + for the whole program are (naturally) global variables. */ -struct gomp_team +struct gomp_task_icv { - /* This lock protects access to the following work shares data structures. 
*/ - gomp_mutex_t work_share_lock; + unsigned long nthreads_var; + enum gomp_schedule_type run_sched_var; + int run_sched_modifier; + bool dyn_var; + bool nest_var; +}; - /* This is a dynamically sized array containing pointers to the control - structs for all "live" work share constructs. Here "live" means that - the construct has been encountered by at least one thread, and not - completed by all threads. */ - struct gomp_work_share **work_shares; +extern struct gomp_task_icv gomp_global_icv; +extern unsigned long gomp_thread_limit_var; +extern unsigned long gomp_remaining_threads_count; +#ifndef HAVE_SYNC_BUILTINS +extern gomp_mutex_t gomp_remaining_threads_lock; +#endif +extern unsigned long gomp_max_active_levels_var; +extern unsigned long long gomp_spin_count_var, gomp_throttled_spin_count_var; +extern unsigned long gomp_available_cpus, gomp_managed_threads; - /* The work_shares array is indexed by "generation & generation_mask". - The mask will be 2**N - 1, where 2**N is the size of the array. */ - unsigned generation_mask; +enum gomp_task_kind +{ + GOMP_TASK_IMPLICIT, + GOMP_TASK_IFFALSE, + GOMP_TASK_WAITING, + GOMP_TASK_TIED +}; - /* These two values define the bounds of the elements of the work_shares - array that are currently in use. */ - unsigned oldest_live_gen; - unsigned num_live_gen; +/* This structure describes a "task" to be run by a thread. */ +struct gomp_task +{ + struct gomp_task *parent; + struct gomp_task *children; + struct gomp_task *next_child; + struct gomp_task *prev_child; + struct gomp_task *next_queue; + struct gomp_task *prev_queue; + struct gomp_task_icv icv; + void (*fn) (void *); + void *fn_data; + enum gomp_task_kind kind; + bool in_taskwait; + gomp_sem_t taskwait_sem; +}; + +/* This structure describes a "team" of threads. These are the threads + that are spawned by a PARALLEL constructs, as well as the work sharing + constructs that the team encounters. 
*/ + +struct gomp_team +{ /* This is the number of threads in the current team. */ unsigned nthreads; + /* This is number of gomp_work_share structs that have been allocated + as a block last time. */ + unsigned work_share_chunk; + /* This is the saved team state that applied to a master thread before the current thread was created. */ struct gomp_team_state prev_ts; - /* This barrier is used for most synchronization of the team. */ - gomp_barrier_t barrier; - /* This semaphore should be used by the master thread instead of its "native" semaphore in the thread structure. Required for nested parallels, as the master is a member of two teams. */ gomp_sem_t master_release; - /* This array contains pointers to the release semaphore of the threads - in the team. */ - gomp_sem_t *ordered_release[]; + /* This points to an array with pointers to the release semaphore + of the threads in the team. */ + gomp_sem_t **ordered_release; + + /* List of gomp_work_share structs chained through next_free fields. + This is populated and taken off only by the first thread in the + team encountering a new work sharing construct, in a critical + section. */ + struct gomp_work_share *work_share_list_alloc; + + /* List of gomp_work_share structs freed by free_work_share. New + entries are atomically added to the start of the list, and + alloc_work_share can safely only move all but the first entry + to work_share_list alloc, as free_work_share can happen concurrently + with alloc_work_share. */ + struct gomp_work_share *work_share_list_free; + +#ifdef HAVE_SYNC_BUILTINS + /* Number of simple single regions encountered by threads in this + team. */ + unsigned long single_count; +#else + /* Mutex protecting addition of workshares to work_share_list_free. */ + gomp_mutex_t work_share_list_free_lock; +#endif + + /* This barrier is used for most synchronization of the team. 
*/ + gomp_barrier_t barrier; + + /* Initial work shares, to avoid allocating any gomp_work_share + structs in the common case. */ + struct gomp_work_share work_shares[8]; + + gomp_mutex_t task_lock; + struct gomp_task *task_queue; + int task_count; + int task_running_count; + + /* This array contains structures for implicit tasks. */ + struct gomp_task implicit_task[]; }; /* This structure contains all data that is private to libgomp and is @@ -214,8 +333,28 @@ struct gomp_thread is NULL only if the thread is idle. */ struct gomp_team_state ts; + /* This is the task that the thread is currently executing. */ + struct gomp_task *task; + /* This semaphore is used for ordered loops. */ gomp_sem_t release; + + /* user pthread thread pool */ + struct gomp_thread_pool *thread_pool; +}; + + +struct gomp_thread_pool +{ + /* This array manages threads spawned from the top level, which will + return to the idle loop once the current PARALLEL construct ends. */ + struct gomp_thread **threads; + unsigned threads_size; + unsigned threads_used; + struct gomp_team *last_team; + + /* This barrier holds and releases threads waiting in threads. */ + gomp_barrier_t threads_dock; }; /* ... and here is that TLS data. */ @@ -234,14 +373,20 @@ static inline struct gomp_thread *gomp_thread (void) } #endif -/* These are the OpenMP 2.5 internal control variables described in - section 2.3. At least those that correspond to environment variables. */ +extern struct gomp_task_icv *gomp_new_icv (void); + +/* Here's how to access the current copy of the ICVs. 
*/ -extern unsigned long gomp_nthreads_var; -extern bool gomp_dyn_var; -extern bool gomp_nest_var; -extern enum gomp_schedule_type gomp_run_sched_var; -extern unsigned long gomp_run_sched_chunk; +static inline struct gomp_task_icv *gomp_icv (bool write) +{ + struct gomp_task *task = gomp_thread ()->task; + if (task) + return &task->icv; + else if (write) + return gomp_new_icv (); + else + return &gomp_global_icv; +} /* The attributes to be used during thread creation. */ extern pthread_attr_t gomp_thread_attr; @@ -286,6 +431,22 @@ extern bool gomp_iter_dynamic_next (long *, long *); extern bool gomp_iter_guided_next (long *, long *); #endif +/* iter_ull.c */ + +extern int gomp_iter_ull_static_next (unsigned long long *, + unsigned long long *); +extern bool gomp_iter_ull_dynamic_next_locked (unsigned long long *, + unsigned long long *); +extern bool gomp_iter_ull_guided_next_locked (unsigned long long *, + unsigned long long *); + +#if defined HAVE_SYNC_BUILTINS && defined __LP64__ +extern bool gomp_iter_ull_dynamic_next (unsigned long long *, + unsigned long long *); +extern bool gomp_iter_ull_guided_next (unsigned long long *, + unsigned long long *); +#endif + /* ordered.c */ extern void gomp_ordered_first (void); @@ -297,26 +458,49 @@ extern void gomp_ordered_sync (void); /* parallel.c */ -extern unsigned gomp_resolve_num_threads (unsigned); +extern unsigned gomp_resolve_num_threads (unsigned, unsigned); /* proc.c (in config/) */ extern void gomp_init_num_threads (void); extern unsigned gomp_dynamic_max_threads (void); +/* task.c */ + +extern void gomp_init_task (struct gomp_task *, struct gomp_task *, + struct gomp_task_icv *); +extern void gomp_end_task (void); +extern void gomp_barrier_handle_tasks (gomp_barrier_state_t); + +static void inline +gomp_finish_task (struct gomp_task *task) +{ + gomp_sem_destroy (&task->taskwait_sem); +} + /* team.c */ +extern struct gomp_team *gomp_new_team (unsigned); extern void gomp_team_start (void (*) (void *), void *, 
unsigned, - struct gomp_work_share *); + struct gomp_team *); extern void gomp_team_end (void); /* work.c */ -extern struct gomp_work_share * gomp_new_work_share (bool, unsigned); +extern void gomp_init_work_share (struct gomp_work_share *, bool, unsigned); +extern void gomp_fini_work_share (struct gomp_work_share *); extern bool gomp_work_share_start (bool); extern void gomp_work_share_end (void); extern void gomp_work_share_end_nowait (void); +static inline void +gomp_work_share_init_done (void) +{ + struct gomp_thread *thr = gomp_thread (); + if (__builtin_expect (thr->ts.last_work_share != NULL, 1)) + gomp_ptrlock_set (&thr->ts.last_work_share->next_ws, thr->ts.work_share); +} + #ifdef HAVE_ATTRIBUTE_VISIBILITY # pragma GCC visibility pop #endif @@ -329,6 +513,53 @@ extern void gomp_work_share_end_nowait (void); #define _LIBGOMP_OMP_LOCK_DEFINED 1 #include "omp.h.in" +#if !defined (HAVE_ATTRIBUTE_VISIBILITY) \ + || !defined (HAVE_ATTRIBUTE_ALIAS) \ + || !defined (PIC) +# undef LIBGOMP_GNU_SYMBOL_VERSIONING +#endif + +#ifdef LIBGOMP_GNU_SYMBOL_VERSIONING +extern void gomp_init_lock_30 (omp_lock_t *) __GOMP_NOTHROW; +extern void gomp_destroy_lock_30 (omp_lock_t *) __GOMP_NOTHROW; +extern void gomp_set_lock_30 (omp_lock_t *) __GOMP_NOTHROW; +extern void gomp_unset_lock_30 (omp_lock_t *) __GOMP_NOTHROW; +extern int gomp_test_lock_30 (omp_lock_t *) __GOMP_NOTHROW; +extern void gomp_init_nest_lock_30 (omp_nest_lock_t *) __GOMP_NOTHROW; +extern void gomp_destroy_nest_lock_30 (omp_nest_lock_t *) __GOMP_NOTHROW; +extern void gomp_set_nest_lock_30 (omp_nest_lock_t *) __GOMP_NOTHROW; +extern void gomp_unset_nest_lock_30 (omp_nest_lock_t *) __GOMP_NOTHROW; +extern int gomp_test_nest_lock_30 (omp_nest_lock_t *) __GOMP_NOTHROW; + +extern void gomp_init_lock_25 (omp_lock_25_t *) __GOMP_NOTHROW; +extern void gomp_destroy_lock_25 (omp_lock_25_t *) __GOMP_NOTHROW; +extern void gomp_set_lock_25 (omp_lock_25_t *) __GOMP_NOTHROW; +extern void gomp_unset_lock_25 (omp_lock_25_t *) 
__GOMP_NOTHROW; +extern int gomp_test_lock_25 (omp_lock_25_t *) __GOMP_NOTHROW; +extern void gomp_init_nest_lock_25 (omp_nest_lock_25_t *) __GOMP_NOTHROW; +extern void gomp_destroy_nest_lock_25 (omp_nest_lock_25_t *) __GOMP_NOTHROW; +extern void gomp_set_nest_lock_25 (omp_nest_lock_25_t *) __GOMP_NOTHROW; +extern void gomp_unset_nest_lock_25 (omp_nest_lock_25_t *) __GOMP_NOTHROW; +extern int gomp_test_nest_lock_25 (omp_nest_lock_25_t *) __GOMP_NOTHROW; + +# define strong_alias(fn, al) \ + extern __typeof (fn) al __attribute__ ((alias (#fn))); +# define omp_lock_symver(fn) \ + __asm (".symver g" #fn "_30, " #fn "@@OMP_3.0"); \ + __asm (".symver g" #fn "_25, " #fn "@OMP_1.0"); +#else +# define gomp_init_lock_30 omp_init_lock +# define gomp_destroy_lock_30 omp_destroy_lock +# define gomp_set_lock_30 omp_set_lock +# define gomp_unset_lock_30 omp_unset_lock +# define gomp_test_lock_30 omp_test_lock +# define gomp_init_nest_lock_30 omp_init_nest_lock +# define gomp_destroy_nest_lock_30 omp_destroy_nest_lock +# define gomp_set_nest_lock_30 omp_set_nest_lock +# define gomp_unset_nest_lock_30 omp_unset_nest_lock +# define gomp_test_nest_lock_30 omp_test_nest_lock +#endif + #ifdef HAVE_ATTRIBUTE_VISIBILITY # define attribute_hidden __attribute__ ((visibility ("hidden"))) #else diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map index 9e13ef8116c..e6c12fa0019 100644 --- a/libgomp/libgomp.map +++ b/libgomp/libgomp.map @@ -55,6 +55,53 @@ OMP_2.0 { omp_get_wtime_; } OMP_1.0; +OMP_3.0 { + global: + omp_set_schedule; + omp_set_schedule_; + omp_set_schedule_8_; + omp_get_schedule; + omp_get_schedule_; + omp_get_schedule_8_; + omp_get_thread_limit; + omp_get_thread_limit_; + omp_set_max_active_levels; + omp_set_max_active_levels_; + omp_set_max_active_levels_8_; + omp_get_max_active_levels; + omp_get_max_active_levels_; + omp_get_level; + omp_get_level_; + omp_get_ancestor_thread_num; + omp_get_ancestor_thread_num_; + omp_get_ancestor_thread_num_8_; + omp_get_team_size; + 
omp_get_team_size_; + omp_get_team_size_8_; + omp_get_active_level; + omp_get_active_level_; + omp_init_lock; + omp_init_nest_lock; + omp_destroy_lock; + omp_destroy_nest_lock; + omp_set_lock; + omp_set_nest_lock; + omp_unset_lock; + omp_unset_nest_lock; + omp_test_lock; + omp_test_nest_lock; + omp_destroy_lock_; + omp_destroy_nest_lock_; + omp_init_lock_; + omp_init_nest_lock_; + omp_set_lock_; + omp_set_nest_lock_; + omp_test_lock_; + omp_test_nest_lock_; + omp_unset_lock_; + omp_unset_nest_lock_; +} OMP_2.0; + GOMP_1.0 { global: GOMP_atomic_end; @@ -70,16 +117,12 @@ GOMP_1.0 { GOMP_loop_end_nowait; GOMP_loop_guided_next; GOMP_loop_guided_start; - GOMP_loop_ordered_dynamic_first; GOMP_loop_ordered_dynamic_next; GOMP_loop_ordered_dynamic_start; - GOMP_loop_ordered_guided_first; GOMP_loop_ordered_guided_next; GOMP_loop_ordered_guided_start; - GOMP_loop_ordered_runtime_first; GOMP_loop_ordered_runtime_next; GOMP_loop_ordered_runtime_start; - GOMP_loop_ordered_static_first; GOMP_loop_ordered_static_next; GOMP_loop_ordered_static_start; GOMP_loop_runtime_next; @@ -103,3 +146,25 @@ GOMP_1.0 { GOMP_single_copy_start; GOMP_single_start; }; + +GOMP_2.0 { + global: + GOMP_task; + GOMP_taskwait; + GOMP_loop_ull_dynamic_next; + GOMP_loop_ull_dynamic_start; + GOMP_loop_ull_guided_next; + GOMP_loop_ull_guided_start; + GOMP_loop_ull_ordered_dynamic_next; + GOMP_loop_ull_ordered_dynamic_start; + GOMP_loop_ull_ordered_guided_next; + GOMP_loop_ull_ordered_guided_start; + GOMP_loop_ull_ordered_runtime_next; + GOMP_loop_ull_ordered_runtime_start; + GOMP_loop_ull_ordered_static_next; + GOMP_loop_ull_ordered_static_start; + GOMP_loop_ull_runtime_next; + GOMP_loop_ull_runtime_start; + GOMP_loop_ull_static_next; + GOMP_loop_ull_static_start; +} GOMP_1.0; diff --git a/libgomp/libgomp_f.h.in b/libgomp/libgomp_f.h.in index 85543565a1e..ecd92a8060e 100644 --- a/libgomp/libgomp_f.h.in +++ b/libgomp/libgomp_f.h.in @@ -1,4 +1,4 @@ -/* Copyright (C) 2005 Free Software Foundation, Inc. 
+/* Copyright (C) 2005, 2008 Free Software Foundation, Inc. Contributed by Jakub Jelinek <jakub@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). @@ -53,6 +53,26 @@ typedef union { omp_nest_lock_t *lock; uint64_t u; } *omp_nest_lock_arg_t; # define omp_nest_lock_arg(arg) ((arg)->lock) # endif +#if (@OMP_LOCK_25_SIZE@ == @OMP_LOCK_25_KIND@) \ + && (@OMP_LOCK_25_ALIGN@ <= @OMP_LOCK_25_SIZE@) +# define OMP_LOCK_25_DIRECT +typedef omp_lock_25_t *omp_lock_25_arg_t; +# define omp_lock_25_arg(arg) (arg) +#else +typedef union { omp_lock_25_t *lock; uint64_t u; } *omp_lock_25_arg_t; +# define omp_lock_25_arg(arg) ((arg)->lock) +# endif + +#if (@OMP_NEST_LOCK_25_SIZE@ == @OMP_NEST_LOCK_25_KIND@) \ + && (@OMP_NEST_LOCK_25_ALIGN@ <= @OMP_NEST_LOCK_25_SIZE@) +# define OMP_NEST_LOCK_25_DIRECT +typedef omp_nest_lock_25_t *omp_nest_lock_25_arg_t; +# define omp_nest_lock_25_arg(arg) (arg) +#else +typedef union { omp_nest_lock_25_t *lock; uint64_t u; } *omp_nest_lock_25_arg_t; +# define omp_nest_lock_25_arg(arg) ((arg)->lock) +# endif + static inline void omp_check_defines (void) { @@ -63,6 +83,14 @@ omp_check_defines (void) || @OMP_LOCK_KIND@ != sizeof (*(omp_lock_arg_t) 0) || @OMP_NEST_LOCK_KIND@ != sizeof (*(omp_nest_lock_arg_t) 0)) ? -1 : 1] __attribute__ ((__unused__)); + char test2[(@OMP_LOCK_25_SIZE@ != sizeof (omp_lock_25_t) + || @OMP_LOCK_25_ALIGN@ != __alignof (omp_lock_25_t) + || @OMP_NEST_LOCK_25_SIZE@ != sizeof (omp_nest_lock_25_t) + || @OMP_NEST_LOCK_25_ALIGN@ != __alignof (omp_nest_lock_25_t) + || @OMP_LOCK_25_KIND@ != sizeof (*(omp_lock_25_arg_t) 0) + || @OMP_NEST_LOCK_25_KIND@ + != sizeof (*(omp_nest_lock_25_arg_t) 0)) + ? -1 : 1] __attribute__ ((__unused__)); } #endif /* LIBGOMP_F_H */ diff --git a/libgomp/libgomp_g.h b/libgomp/libgomp_g.h index 52ecafdcf0f..322fd4f23d1 100644 --- a/libgomp/libgomp_g.h +++ b/libgomp/libgomp_g.h @@ -1,4 +1,4 @@ -/* Copyright (C) 2005 Free Software Foundation, Inc. 
+/* Copyright (C) 2005, 2007, 2008 Free Software Foundation, Inc. Contributed by Richard Henderson <rth@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). @@ -83,6 +83,74 @@ extern void GOMP_parallel_loop_runtime_start (void (*)(void *), void *, extern void GOMP_loop_end (void); extern void GOMP_loop_end_nowait (void); +/* loop_ull.c */ + +extern bool GOMP_loop_ull_static_start (bool, unsigned long long, + unsigned long long, + unsigned long long, + unsigned long long, + unsigned long long *, + unsigned long long *); +extern bool GOMP_loop_ull_dynamic_start (bool, unsigned long long, + unsigned long long, + unsigned long long, + unsigned long long, + unsigned long long *, + unsigned long long *); +extern bool GOMP_loop_ull_guided_start (bool, unsigned long long, + unsigned long long, + unsigned long long, + unsigned long long, + unsigned long long *, + unsigned long long *); +extern bool GOMP_loop_ull_runtime_start (bool, unsigned long long, + unsigned long long, + unsigned long long, + unsigned long long *, + unsigned long long *); + +extern bool GOMP_loop_ull_ordered_static_start (bool, unsigned long long, + unsigned long long, + unsigned long long, + unsigned long long, + unsigned long long *, + unsigned long long *); +extern bool GOMP_loop_ull_ordered_dynamic_start (bool, unsigned long long, + unsigned long long, + unsigned long long, + unsigned long long, + unsigned long long *, + unsigned long long *); +extern bool GOMP_loop_ull_ordered_guided_start (bool, unsigned long long, + unsigned long long, + unsigned long long, + unsigned long long, + unsigned long long *, + unsigned long long *); +extern bool GOMP_loop_ull_ordered_runtime_start (bool, unsigned long long, + unsigned long long, + unsigned long long, + unsigned long long *, + unsigned long long *); + +extern bool GOMP_loop_ull_static_next (unsigned long long *, + unsigned long long *); +extern bool GOMP_loop_ull_dynamic_next (unsigned long long *, + unsigned long long *); +extern bool 
GOMP_loop_ull_guided_next (unsigned long long *, + unsigned long long *); +extern bool GOMP_loop_ull_runtime_next (unsigned long long *, + unsigned long long *); + +extern bool GOMP_loop_ull_ordered_static_next (unsigned long long *, + unsigned long long *); +extern bool GOMP_loop_ull_ordered_dynamic_next (unsigned long long *, + unsigned long long *); +extern bool GOMP_loop_ull_ordered_guided_next (unsigned long long *, + unsigned long long *); +extern bool GOMP_loop_ull_ordered_runtime_next (unsigned long long *, + unsigned long long *); + /* ordered.c */ extern void GOMP_ordered_start (void); @@ -93,6 +161,12 @@ extern void GOMP_ordered_end (void); extern void GOMP_parallel_start (void (*) (void *), void *, unsigned); extern void GOMP_parallel_end (void); +/* team.c */ + +extern void GOMP_task (void (*) (void *), void *, void (*) (void *, void *), + long, long, bool, unsigned); +extern void GOMP_taskwait (void); + /* sections.c */ extern unsigned GOMP_sections_start (unsigned); diff --git a/libgomp/loop.c b/libgomp/loop.c index 58fd9a8af28..1cea334bcbf 100644 --- a/libgomp/loop.c +++ b/libgomp/loop.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2005 Free Software Foundation, Inc. +/* Copyright (C) 2005, 2008 Free Software Foundation, Inc. Contributed by Richard Henderson <rth@redhat.com>. This file is part of the GNU OpenMP Library (libgomp). @@ -27,8 +27,9 @@ /* This file handles the LOOP (FOR/DO) construct. */ -#include "libgomp.h" +#include <limits.h> #include <stdlib.h> +#include "libgomp.h" /* Initialize the given work share construct from the given arguments. */ @@ -44,6 +45,39 @@ gomp_loop_init (struct gomp_work_share *ws, long start, long end, long incr, ? start : end; ws->incr = incr; ws->next = start; + if (sched == GFS_DYNAMIC) + { + ws->chunk_size *= incr; + +#ifdef HAVE_SYNC_BUILTINS + { + /* For dynamic scheduling prepare things to make each iteration + faster. 
*/ + struct gomp_thread *thr = gomp_thread (); + struct gomp_team *team = thr->ts.team; + long nthreads = team ? team->nthreads : 1; + + if (__builtin_expect (incr > 0, 1)) + { + /* Cheap overflow protection. */ + if (__builtin_expect ((nthreads | ws->chunk_size) + >= 1UL << (sizeof (long) + * __CHAR_BIT__ / 2 - 1), 0)) + ws->mode = 0; + else + ws->mode = ws->end < (LONG_MAX + - (nthreads + 1) * ws->chunk_size); + } + /* Cheap overflow protection. */ + else if (__builtin_expect ((nthreads | -ws->chunk_size) + >= 1UL << (sizeof (long) + * __CHAR_BIT__ / 2 - 1), 0)) + ws->mode = 0; + else + ws->mode = ws->end > (nthreads + 1) * -ws->chunk_size - LONG_MAX; + } +#endif + } } /* The *_start routines are called when first encountering a loop construct @@ -68,10 +102,13 @@ gomp_loop_static_start (long start, long end, long incr, long chunk_size, { struct gomp_thread *thr = gomp_thread (); + thr->ts.static_trip = 0; if (gomp_work_share_start (false)) - gomp_loop_init (thr->ts.work_share, start, end, incr, - GFS_STATIC, chunk_size); - gomp_mutex_unlock (&thr->ts.work_share->lock); + { + gomp_loop_init (thr->ts.work_share, start, end, incr, + GFS_STATIC, chunk_size); + gomp_work_share_init_done (); + } return !gomp_iter_static_next (istart, iend); } @@ -84,13 +121,16 @@ gomp_loop_dynamic_start (long start, long end, long incr, long chunk_size, bool ret; if (gomp_work_share_start (false)) - gomp_loop_init (thr->ts.work_share, start, end, incr, - GFS_DYNAMIC, chunk_size); + { + gomp_loop_init (thr->ts.work_share, start, end, incr, + GFS_DYNAMIC, chunk_size); + gomp_work_share_init_done (); + } #ifdef HAVE_SYNC_BUILTINS - gomp_mutex_unlock (&thr->ts.work_share->lock); ret = gomp_iter_dynamic_next (istart, iend); #else + gomp_mutex_lock (&thr->ts.work_share->lock); ret = gomp_iter_dynamic_next_locked (istart, iend); gomp_mutex_unlock (&thr->ts.work_share->lock); #endif @@ -106,13 +146,16 @@ gomp_loop_guided_start (long start, long end, long incr, long chunk_size, bool ret; if 
(gomp_work_share_start (false)) - gomp_loop_init (thr->ts.work_share, start, end, incr, - GFS_GUIDED, chunk_size); + { + gomp_loop_init (thr->ts.work_share, start, end, incr, + GFS_GUIDED, chunk_size); + gomp_work_share_init_done (); + } #ifdef HAVE_SYNC_BUILTINS - gomp_mutex_unlock (&thr->ts.work_share->lock); ret = gomp_iter_guided_next (istart, iend); #else + gomp_mutex_lock (&thr->ts.work_share->lock); ret = gomp_iter_guided_next_locked (istart, iend); gomp_mutex_unlock (&thr->ts.work_share->lock); #endif @@ -124,17 +167,22 @@ bool GOMP_loop_runtime_start (long start, long end, long incr, long *istart, long *iend) { - switch (gomp_run_sched_var) + struct gomp_task_icv *icv = gomp_icv (false); + switch (icv->run_sched_var) { case GFS_STATIC: - return gomp_loop_static_start (start, end, incr, gomp_run_sched_chunk, + return gomp_loop_static_start (start, end, incr, icv->run_sched_modifier, istart, iend); case GFS_DYNAMIC: - return gomp_loop_dynamic_start (start, end, incr, gomp_run_sched_chunk, + return gomp_loop_dynamic_start (start, end, incr, icv->run_sched_modifier, istart, iend); case GFS_GUIDED: - return gomp_loop_guided_start (start, end, incr, gomp_run_sched_chunk, + return gomp_loop_guided_start (start, end, incr, icv->run_sched_modifier, istart, iend); + case GFS_AUTO: + /* For now map to schedule(static), later on we could play with feedback + driven choice. 
*/ + return gomp_loop_static_start (start, end, incr, 0, istart, iend); default: abort (); } @@ -149,13 +197,14 @@ gomp_loop_ordered_static_start (long start, long end, long incr, { struct gomp_thread *thr = gomp_thread (); + thr->ts.static_trip = 0; if (gomp_work_share_start (true)) { gomp_loop_init (thr->ts.work_share, start, end, incr, GFS_STATIC, chunk_size); gomp_ordered_static_init (); + gomp_work_share_init_done (); } - gomp_mutex_unlock (&thr->ts.work_share->lock); return !gomp_iter_static_next (istart, iend); } @@ -168,8 +217,14 @@ gomp_loop_ordered_dynamic_start (long start, long end, long incr, bool ret; if (gomp_work_share_start (true)) - gomp_loop_init (thr->ts.work_share, start, end, incr, - GFS_DYNAMIC, chunk_size); + { + gomp_loop_init (thr->ts.work_share, start, end, incr, + GFS_DYNAMIC, chunk_size); + gomp_mutex_lock (&thr->ts.work_share->lock); + gomp_work_share_init_done (); + } + else + gomp_mutex_lock (&thr->ts.work_share->lock); ret = gomp_iter_dynamic_next_locked (istart, iend); if (ret) @@ -187,8 +242,14 @@ gomp_loop_ordered_guided_start (long start, long end, long incr, bool ret; if (gomp_work_share_start (true)) - gomp_loop_init (thr->ts.work_share, start, end, incr, - GFS_GUIDED, chunk_size); + { + gomp_loop_init (thr->ts.work_share, start, end, incr, + GFS_GUIDED, chunk_size); + gomp_mutex_lock (&thr->ts.work_share->lock); + gomp_work_share_init_done (); + } + else + gomp_mutex_lock (&thr->ts.work_share->lock); ret = gomp_iter_guided_next_locked (istart, iend); if (ret) @@ -202,20 +263,26 @@ bool GOMP_loop_ordered_runtime_start (long start, long end, long incr, long *istart, long *iend) { - switch (gomp_run_sched_var) + struct gomp_task_icv *icv = gomp_icv (false); + switch (icv->run_sched_var) { case GFS_STATIC: return gomp_loop_ordered_static_start (start, end, incr, - gomp_run_sched_chunk, + icv->run_sched_modifier, istart, iend); case GFS_DYNAMIC: return gomp_loop_ordered_dynamic_start (start, end, incr, - gomp_run_sched_chunk, + 
icv->run_sched_modifier, istart, iend); case GFS_GUIDED: return gomp_loop_ordered_guided_start (start, end, incr, - gomp_run_sched_chunk, + icv->run_sched_modifier, istart, iend); + case GFS_AUTO: + /* For now map to schedule(static), later on we could play with feedback + driven choice. */ + return gomp_loop_ordered_static_start (start, end, incr, + 0, istart, iend); default: abort (); } @@ -279,6 +346,7 @@ GOMP_loop_runtime_next (long *istart, long *iend) switch (thr->ts.work_share->sched) { case GFS_STATIC: + case GFS_AUTO: return gomp_loop_static_next (istart, iend); case GFS_DYNAMIC: return gomp_loop_dynamic_next (istart, iend); @@ -356,6 +424,7 @@ GOMP_loop_ordered_runtime_next (long *istart, long *iend) switch (thr->ts.work_share->sched) { case GFS_STATIC: + case GFS_AUTO: return gomp_loop_ordered_static_next (istart, iend); case GFS_DYNAMIC: return gomp_loop_ordered_dynamic_next (istart, iend); @@ -375,12 +444,12 @@ gomp_parallel_loop_start (void (*fn) (void *), void *data, long incr, enum gomp_schedule_type sched, long chunk_size) { - struct gomp_work_share *ws; + struct gomp_team *team; - num_threads = gomp_resolve_num_threads (num_threads); - ws = gomp_new_work_share (false, num_threads); - gomp_loop_init (ws, start, end, incr, sched, chunk_size); - gomp_team_start (fn, data, num_threads, ws); + num_threads = gomp_resolve_num_threads (num_threads, 0); + team = gomp_new_team (num_threads); + gomp_loop_init (&team->work_shares[0], start, end, incr, sched, chunk_size); + gomp_team_start (fn, data, num_threads, team); } void @@ -415,8 +484,9 @@ GOMP_parallel_loop_runtime_start (void (*fn) (void *), void *data, unsigned num_threads, long start, long end, long incr) { + struct gomp_task_icv *icv = gomp_icv (false); gomp_parallel_loop_start (fn, data, num_threads, start, end, incr, - gomp_run_sched_var, gomp_run_sched_chunk); + icv->run_sched_var, icv->run_sched_modifier); } /* The GOMP_loop_end* routines are called after the thread is told that diff --git 
a/libgomp/loop_ull.c b/libgomp/loop_ull.c new file mode 100644 index 00000000000..7dab05326f9 --- /dev/null +++ b/libgomp/loop_ull.c @@ -0,0 +1,565 @@ +/* Copyright (C) 2005, 2008 Free Software Foundation, Inc. + Contributed by Richard Henderson <rth@redhat.com>. + + This file is part of the GNU OpenMP Library (libgomp). + + Libgomp is free software; you can redistribute it and/or modify it + under the terms of the GNU Lesser General Public License as published by + the Free Software Foundation; either version 2.1 of the License, or + (at your option) any later version. + + Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for + more details. + + You should have received a copy of the GNU Lesser General Public License + along with libgomp; see the file COPYING.LIB. If not, write to the + Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, + MA 02110-1301, USA. */ + +/* As a special exception, if you link this library with other files, some + of which are compiled with GCC, to produce an executable, this library + does not by itself cause the resulting executable to be covered by the + GNU General Public License. This exception does not however invalidate + any other reasons why the executable file might be covered by the GNU + General Public License. */ + +/* This file handles the LOOP (FOR/DO) construct. */ + +#include <limits.h> +#include <stdlib.h> +#include "libgomp.h" + +typedef unsigned long long gomp_ull; + +/* Initialize the given work share construct from the given arguments. 
*/ + +static inline void +gomp_loop_ull_init (struct gomp_work_share *ws, bool up, gomp_ull start, + gomp_ull end, gomp_ull incr, enum gomp_schedule_type sched, + gomp_ull chunk_size) +{ + ws->sched = sched; + ws->chunk_size_ull = chunk_size; + /* Canonicalize loops that have zero iterations to ->next == ->end. */ + ws->end_ull = ((up && start > end) || (!up && start < end)) + ? start : end; + ws->incr_ull = incr; + ws->next_ull = start; + ws->mode = 0; + if (sched == GFS_DYNAMIC) + { + ws->chunk_size_ull *= incr; + +#if defined HAVE_SYNC_BUILTINS && defined __LP64__ + { + /* For dynamic scheduling prepare things to make each iteration + faster. */ + struct gomp_thread *thr = gomp_thread (); + struct gomp_team *team = thr->ts.team; + long nthreads = team ? team->nthreads : 1; + + if (__builtin_expect (up, 1)) + { + /* Cheap overflow protection. */ + if (__builtin_expect ((nthreads | ws->chunk_size_ull) + < 1ULL << (sizeof (gomp_ull) + * __CHAR_BIT__ / 2 - 1), 1)) + ws->mode = ws->end_ull < (__LONG_LONG_MAX__ * 2ULL + 1 + - (nthreads + 1) * ws->chunk_size_ull); + } + /* Cheap overflow protection. */ + else if (__builtin_expect ((nthreads | -ws->chunk_size_ull) + < 1ULL << (sizeof (gomp_ull) + * __CHAR_BIT__ / 2 - 1), 1)) + ws->mode = ws->end_ull > ((nthreads + 1) * -ws->chunk_size_ull + - (__LONG_LONG_MAX__ * 2ULL + 1)); + } +#endif + } + if (!up) + ws->mode |= 2; +} + +/* The *_start routines are called when first encountering a loop construct + that is not bound directly to a parallel construct. The first thread + that arrives will create the work-share construct; subsequent threads + will see the construct exists and allocate work from it. + + START, END, INCR are the bounds of the loop; due to the restrictions of + OpenMP, these values must be the same in every thread. This is not + verified (nor is it entirely verifiable, since START is not necessarily + retained intact in the work-share data structure). 
CHUNK_SIZE is the + scheduling parameter; again this must be identical in all threads. + + Returns true if there's any work for this thread to perform. If so, + *ISTART and *IEND are filled with the bounds of the iteration block + allocated to this thread. Returns false if all work was assigned to + other threads prior to this thread's arrival. */ + +static bool +gomp_loop_ull_static_start (bool up, gomp_ull start, gomp_ull end, + gomp_ull incr, gomp_ull chunk_size, + gomp_ull *istart, gomp_ull *iend) +{ + struct gomp_thread *thr = gomp_thread (); + + thr->ts.static_trip = 0; + if (gomp_work_share_start (false)) + { + gomp_loop_ull_init (thr->ts.work_share, up, start, end, incr, + GFS_STATIC, chunk_size); + gomp_work_share_init_done (); + } + + return !gomp_iter_ull_static_next (istart, iend); +} + +static bool +gomp_loop_ull_dynamic_start (bool up, gomp_ull start, gomp_ull end, + gomp_ull incr, gomp_ull chunk_size, + gomp_ull *istart, gomp_ull *iend) +{ + struct gomp_thread *thr = gomp_thread (); + bool ret; + + if (gomp_work_share_start (false)) + { + gomp_loop_ull_init (thr->ts.work_share, up, start, end, incr, + GFS_DYNAMIC, chunk_size); + gomp_work_share_init_done (); + } + +#if defined HAVE_SYNC_BUILTINS && defined __LP64__ + ret = gomp_iter_ull_dynamic_next (istart, iend); +#else + gomp_mutex_lock (&thr->ts.work_share->lock); + ret = gomp_iter_ull_dynamic_next_locked (istart, iend); + gomp_mutex_unlock (&thr->ts.work_share->lock); +#endif + + return ret; +} + +static bool +gomp_loop_ull_guided_start (bool up, gomp_ull start, gomp_ull end, + gomp_ull incr, gomp_ull chunk_size, + gomp_ull *istart, gomp_ull *iend) +{ + struct gomp_thread *thr = gomp_thread (); + bool ret; + + if (gomp_work_share_start (false)) + { + gomp_loop_ull_init (thr->ts.work_share, up, start, end, incr, + GFS_GUIDED, chunk_size); + gomp_work_share_init_done (); + } + +#if defined HAVE_SYNC_BUILTINS && defined __LP64__ + ret = gomp_iter_ull_guided_next (istart, iend); +#else + 
gomp_mutex_lock (&thr->ts.work_share->lock); + ret = gomp_iter_ull_guided_next_locked (istart, iend); + gomp_mutex_unlock (&thr->ts.work_share->lock); +#endif + + return ret; +} + +bool +GOMP_loop_ull_runtime_start (bool up, gomp_ull start, gomp_ull end, + gomp_ull incr, gomp_ull *istart, gomp_ull *iend) +{ + struct gomp_task_icv *icv = gomp_icv (false); + switch (icv->run_sched_var) + { + case GFS_STATIC: + return gomp_loop_ull_static_start (up, start, end, incr, + icv->run_sched_modifier, + istart, iend); + case GFS_DYNAMIC: + return gomp_loop_ull_dynamic_start (up, start, end, incr, + icv->run_sched_modifier, + istart, iend); + case GFS_GUIDED: + return gomp_loop_ull_guided_start (up, start, end, incr, + icv->run_sched_modifier, + istart, iend); + case GFS_AUTO: + /* For now map to schedule(static), later on we could play with feedback + driven choice. */ + return gomp_loop_ull_static_start (up, start, end, incr, + 0, istart, iend); + default: + abort (); + } +} + +/* The *_ordered_*_start routines are similar. The only difference is that + this work-share construct is initialized to expect an ORDERED section. 
*/ + +static bool +gomp_loop_ull_ordered_static_start (bool up, gomp_ull start, gomp_ull end, + gomp_ull incr, gomp_ull chunk_size, + gomp_ull *istart, gomp_ull *iend) +{ + struct gomp_thread *thr = gomp_thread (); + + thr->ts.static_trip = 0; + if (gomp_work_share_start (true)) + { + gomp_loop_ull_init (thr->ts.work_share, up, start, end, incr, + GFS_STATIC, chunk_size); + gomp_ordered_static_init (); + gomp_work_share_init_done (); + } + + return !gomp_iter_ull_static_next (istart, iend); +} + +static bool +gomp_loop_ull_ordered_dynamic_start (bool up, gomp_ull start, gomp_ull end, + gomp_ull incr, gomp_ull chunk_size, + gomp_ull *istart, gomp_ull *iend) +{ + struct gomp_thread *thr = gomp_thread (); + bool ret; + + if (gomp_work_share_start (true)) + { + gomp_loop_ull_init (thr->ts.work_share, up, start, end, incr, + GFS_DYNAMIC, chunk_size); + gomp_mutex_lock (&thr->ts.work_share->lock); + gomp_work_share_init_done (); + } + else + gomp_mutex_lock (&thr->ts.work_share->lock); + + ret = gomp_iter_ull_dynamic_next_locked (istart, iend); + if (ret) + gomp_ordered_first (); + gomp_mutex_unlock (&thr->ts.work_share->lock); + + return ret; +} + +static bool +gomp_loop_ull_ordered_guided_start (bool up, gomp_ull start, gomp_ull end, + gomp_ull incr, gomp_ull chunk_size, + gomp_ull *istart, gomp_ull *iend) +{ + struct gomp_thread *thr = gomp_thread (); + bool ret; + + if (gomp_work_share_start (true)) + { + gomp_loop_ull_init (thr->ts.work_share, up, start, end, incr, + GFS_GUIDED, chunk_size); + gomp_mutex_lock (&thr->ts.work_share->lock); + gomp_work_share_init_done (); + } + else + gomp_mutex_lock (&thr->ts.work_share->lock); + + ret = gomp_iter_ull_guided_next_locked (istart, iend); + if (ret) + gomp_ordered_first (); + gomp_mutex_unlock (&thr->ts.work_share->lock); + + return ret; +} + +bool +GOMP_loop_ull_ordered_runtime_start (bool up, gomp_ull start, gomp_ull end, + gomp_ull incr, gomp_ull *istart, + gomp_ull *iend) +{ + struct gomp_task_icv *icv = gomp_icv 
(false); + switch (icv->run_sched_var) + { + case GFS_STATIC: + return gomp_loop_ull_ordered_static_start (up, start, end, incr, + icv->run_sched_modifier, + istart, iend); + case GFS_DYNAMIC: + return gomp_loop_ull_ordered_dynamic_start (up, start, end, incr, + icv->run_sched_modifier, + istart, iend); + case GFS_GUIDED: + return gomp_loop_ull_ordered_guided_start (up, start, end, incr, + icv->run_sched_modifier, + istart, iend); + case GFS_AUTO: + /* For now map to schedule(static), later on we could play with feedback + driven choice. */ + return gomp_loop_ull_ordered_static_start (up, start, end, incr, + 0, istart, iend); + default: + abort (); + } +} + +/* The *_next routines are called when the thread completes processing of + the iteration block currently assigned to it. If the work-share + construct is bound directly to a parallel construct, then the iteration + bounds may have been set up before the parallel. In which case, this + may be the first iteration for the thread. + + Returns true if there is work remaining to be performed; *ISTART and + *IEND are filled with a new iteration block. Returns false if all work + has been assigned. 
*/ + +static bool +gomp_loop_ull_static_next (gomp_ull *istart, gomp_ull *iend) +{ + return !gomp_iter_ull_static_next (istart, iend); +} + +static bool +gomp_loop_ull_dynamic_next (gomp_ull *istart, gomp_ull *iend) +{ + bool ret; + +#if defined HAVE_SYNC_BUILTINS && defined __LP64__ + ret = gomp_iter_ull_dynamic_next (istart, iend); +#else + struct gomp_thread *thr = gomp_thread (); + gomp_mutex_lock (&thr->ts.work_share->lock); + ret = gomp_iter_ull_dynamic_next_locked (istart, iend); + gomp_mutex_unlock (&thr->ts.work_share->lock); +#endif + + return ret; +} + +static bool +gomp_loop_ull_guided_next (gomp_ull *istart, gomp_ull *iend) +{ + bool ret; + +#if defined HAVE_SYNC_BUILTINS && defined __LP64__ + ret = gomp_iter_ull_guided_next (istart, iend); +#else + struct gomp_thread *thr = gomp_thread (); + gomp_mutex_lock (&thr->ts.work_share->lock); + ret = gomp_iter_ull_guided_next_locked (istart, iend); + gomp_mutex_unlock (&thr->ts.work_share->lock); +#endif + + return ret; +} + +bool +GOMP_loop_ull_runtime_next (gomp_ull *istart, gomp_ull *iend) +{ + struct gomp_thread *thr = gomp_thread (); + + switch (thr->ts.work_share->sched) + { + case GFS_STATIC: + case GFS_AUTO: + return gomp_loop_ull_static_next (istart, iend); + case GFS_DYNAMIC: + return gomp_loop_ull_dynamic_next (istart, iend); + case GFS_GUIDED: + return gomp_loop_ull_guided_next (istart, iend); + default: + abort (); + } +} + +/* The *_ordered_*_next routines are called when the thread completes + processing of the iteration block currently assigned to it. + + Returns true if there is work remaining to be performed; *ISTART and + *IEND are filled with a new iteration block. Returns false if all work + has been assigned. 
*/ + +static bool +gomp_loop_ull_ordered_static_next (gomp_ull *istart, gomp_ull *iend) +{ + struct gomp_thread *thr = gomp_thread (); + int test; + + gomp_ordered_sync (); + gomp_mutex_lock (&thr->ts.work_share->lock); + test = gomp_iter_ull_static_next (istart, iend); + if (test >= 0) + gomp_ordered_static_next (); + gomp_mutex_unlock (&thr->ts.work_share->lock); + + return test == 0; +} + +static bool +gomp_loop_ull_ordered_dynamic_next (gomp_ull *istart, gomp_ull *iend) +{ + struct gomp_thread *thr = gomp_thread (); + bool ret; + + gomp_ordered_sync (); + gomp_mutex_lock (&thr->ts.work_share->lock); + ret = gomp_iter_ull_dynamic_next_locked (istart, iend); + if (ret) + gomp_ordered_next (); + else + gomp_ordered_last (); + gomp_mutex_unlock (&thr->ts.work_share->lock); + + return ret; +} + +static bool +gomp_loop_ull_ordered_guided_next (gomp_ull *istart, gomp_ull *iend) +{ + struct gomp_thread *thr = gomp_thread (); + bool ret; + + gomp_ordered_sync (); + gomp_mutex_lock (&thr->ts.work_share->lock); + ret = gomp_iter_ull_guided_next_locked (istart, iend); + if (ret) + gomp_ordered_next (); + else + gomp_ordered_last (); + gomp_mutex_unlock (&thr->ts.work_share->lock); + + return ret; +} + +bool +GOMP_loop_ull_ordered_runtime_next (gomp_ull *istart, gomp_ull *iend) +{ + struct gomp_thread *thr = gomp_thread (); + + switch (thr->ts.work_share->sched) + { + case GFS_STATIC: + case GFS_AUTO: + return gomp_loop_ull_ordered_static_next (istart, iend); + case GFS_DYNAMIC: + return gomp_loop_ull_ordered_dynamic_next (istart, iend); + case GFS_GUIDED: + return gomp_loop_ull_ordered_guided_next (istart, iend); + default: + abort (); + } +} + +/* We use static functions above so that we're sure that the "runtime" + function can defer to the proper routine without interposition. We + export the static function with a strong alias when possible, or with + a wrapper function otherwise. 
*/ + +#ifdef HAVE_ATTRIBUTE_ALIAS +extern __typeof(gomp_loop_ull_static_start) GOMP_loop_ull_static_start + __attribute__((alias ("gomp_loop_ull_static_start"))); +extern __typeof(gomp_loop_ull_dynamic_start) GOMP_loop_ull_dynamic_start + __attribute__((alias ("gomp_loop_ull_dynamic_start"))); +extern __typeof(gomp_loop_ull_guided_start) GOMP_loop_ull_guided_start + __attribute__((alias ("gomp_loop_ull_guided_start"))); + +extern __typeof(gomp_loop_ull_ordered_static_start) GOMP_loop_ull_ordered_static_start + __attribute__((alias ("gomp_loop_ull_ordered_static_start"))); +extern __typeof(gomp_loop_ull_ordered_dynamic_start) GOMP_loop_ull_ordered_dynamic_start + __attribute__((alias ("gomp_loop_ull_ordered_dynamic_start"))); +extern __typeof(gomp_loop_ull_ordered_guided_start) GOMP_loop_ull_ordered_guided_start + __attribute__((alias ("gomp_loop_ull_ordered_guided_start"))); + +extern __typeof(gomp_loop_ull_static_next) GOMP_loop_ull_static_next + __attribute__((alias ("gomp_loop_ull_static_next"))); +extern __typeof(gomp_loop_ull_dynamic_next) GOMP_loop_ull_dynamic_next + __attribute__((alias ("gomp_loop_ull_dynamic_next"))); +extern __typeof(gomp_loop_ull_guided_next) GOMP_loop_ull_guided_next + __attribute__((alias ("gomp_loop_ull_guided_next"))); + +extern __typeof(gomp_loop_ull_ordered_static_next) GOMP_loop_ull_ordered_static_next + __attribute__((alias ("gomp_loop_ull_ordered_static_next"))); +extern __typeof(gomp_loop_ull_ordered_dynamic_next) GOMP_loop_ull_ordered_dynamic_next + __attribute__((alias ("gomp_loop_ull_ordered_dynamic_next"))); +extern __typeof(gomp_loop_ull_ordered_guided_next) GOMP_loop_ull_ordered_guided_next + __attribute__((alias ("gomp_loop_ull_ordered_guided_next"))); +#else +bool +GOMP_loop_ull_static_start (gomp_ull start, gomp_ull end, gomp_ull incr, gomp_ull chunk_size, + gomp_ull *istart, gomp_ull *iend) +{ + return gomp_loop_ull_static_start (start, end, incr, chunk_size, istart, iend); +} + +bool +GOMP_loop_ull_dynamic_start 
(gomp_ull start, gomp_ull end, gomp_ull incr, gomp_ull chunk_size, + gomp_ull *istart, gomp_ull *iend) +{ + return gomp_loop_ull_dynamic_start (start, end, incr, chunk_size, istart, iend); +} + +bool +GOMP_loop_ull_guided_start (gomp_ull start, gomp_ull end, gomp_ull incr, gomp_ull chunk_size, + gomp_ull *istart, gomp_ull *iend) +{ + return gomp_loop_ull_guided_start (start, end, incr, chunk_size, istart, iend); +} + +bool +GOMP_loop_ull_ordered_static_start (gomp_ull start, gomp_ull end, gomp_ull incr, + gomp_ull chunk_size, gomp_ull *istart, gomp_ull *iend) +{ + return gomp_loop_ull_ordered_static_start (start, end, incr, chunk_size, + istart, iend); +} + +bool +GOMP_loop_ull_ordered_dynamic_start (gomp_ull start, gomp_ull end, gomp_ull incr, + gomp_ull chunk_size, gomp_ull *istart, gomp_ull *iend) +{ + return gomp_loop_ull_ordered_dynamic_start (start, end, incr, chunk_size, + istart, iend); +} + +bool +GOMP_loop_ull_ordered_guided_start (gomp_ull start, gomp_ull end, gomp_ull incr, + gomp_ull chunk_size, gomp_ull *istart, gomp_ull *iend) +{ + return gomp_loop_ull_ordered_guided_start (start, end, incr, chunk_size, + istart, iend); +} + +bool +GOMP_loop_ull_static_next (gomp_ull *istart, gomp_ull *iend) +{ + return gomp_loop_ull_static_next (istart, iend); +} + +bool +GOMP_loop_ull_dynamic_next (gomp_ull *istart, gomp_ull *iend) +{ + return gomp_loop_ull_dynamic_next (istart, iend); +} + +bool +GOMP_loop_ull_guided_next (gomp_ull *istart, gomp_ull *iend) +{ + return gomp_loop_ull_guided_next (istart, iend); +} + +bool +GOMP_loop_ull_ordered_static_next (gomp_ull *istart, gomp_ull *iend) +{ + return gomp_loop_ull_ordered_static_next (istart, iend); +} + +bool +GOMP_loop_ull_ordered_dynamic_next (gomp_ull *istart, gomp_ull *iend) +{ + return gomp_loop_ull_ordered_dynamic_next (istart, iend); +} + +bool +GOMP_loop_ull_ordered_guided_next (gomp_ull *istart, gomp_ull *iend) +{ + return gomp_loop_ull_ordered_guided_next (istart, iend); +} +#endif diff --git 
a/libgomp/omp.h.in b/libgomp/omp.h.in
index 5ebcdbb2735..d4fe94a2ca7 100644
--- a/libgomp/omp.h.in
+++ b/libgomp/omp.h.in
@@ -1,4 +1,4 @@
-/* Copyright (C) 2005, 2007 Free Software Foundation, Inc.
+/* Copyright (C) 2005, 2007, 2008 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.

    This file is part of the GNU OpenMP Library (libgomp).
@@ -47,6 +47,14 @@ typedef struct
 } omp_nest_lock_t;
 #endif

+typedef enum omp_sched_t
+{
+  omp_sched_static = 1,
+  omp_sched_dynamic = 2,
+  omp_sched_guided = 3,
+  omp_sched_auto = 4
+} omp_sched_t;
+
 #ifdef __cplusplus
 extern "C" {
 # define __GOMP_NOTHROW throw ()
@@ -83,6 +91,16 @@ extern int omp_test_nest_lock (omp_nest_lock_t *) __GOMP_NOTHROW;
 extern double omp_get_wtime (void) __GOMP_NOTHROW;
 extern double omp_get_wtick (void) __GOMP_NOTHROW;

+void omp_set_schedule (omp_sched_t, int) __GOMP_NOTHROW;
+void omp_get_schedule (omp_sched_t *, int *) __GOMP_NOTHROW;
+int omp_get_thread_limit (void) __GOMP_NOTHROW;
+void omp_set_max_active_levels (int) __GOMP_NOTHROW;
+int omp_get_max_active_levels (void) __GOMP_NOTHROW;
+int omp_get_level (void) __GOMP_NOTHROW;
+int omp_get_ancestor_thread_num (int) __GOMP_NOTHROW;
+int omp_get_team_size (int) __GOMP_NOTHROW;
+int omp_get_active_level (void) __GOMP_NOTHROW;
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/libgomp/omp_lib.f90.in b/libgomp/omp_lib.f90.in
index 4b8553b3236..a31a94567ea 100644
--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -1,4 +1,4 @@
-! Copyright (C) 2005 Free Software Foundation, Inc.
+! Copyright (C) 2005, 2007, 2008 Free Software Foundation, Inc.
 ! Contributed by Jakub Jelinek <jakub@redhat.com>.

 ! This file is part of the GNU OpenMP Library (libgomp).
@@ -30,11 +30,16 @@
         integer, parameter :: omp_logical_kind = 4
         integer, parameter :: omp_lock_kind = @OMP_LOCK_KIND@
         integer, parameter :: omp_nest_lock_kind = @OMP_NEST_LOCK_KIND@
+        integer, parameter :: omp_sched_kind = 4
       end module

       module omp_lib
         use omp_lib_kinds
-        integer, parameter :: openmp_version = 200505
+        integer, parameter :: openmp_version = 200805
+        integer (omp_sched_kind), parameter :: omp_sched_static = 1
+        integer (omp_sched_kind), parameter :: omp_sched_dynamic = 2
+        integer (omp_sched_kind), parameter :: omp_sched_guided = 3
+        integer (omp_sched_kind), parameter :: omp_sched_auto = 4

         interface
           subroutine omp_init_lock (lock)
@@ -196,4 +201,95 @@
           end function omp_get_wtime
         end interface

+        interface omp_set_schedule
+          subroutine omp_set_schedule (kind, modifier)
+            use omp_lib_kinds
+            integer (omp_sched_kind), intent (in) :: kind
+            integer (4), intent (in) :: modifier
+          end subroutine omp_set_schedule
+          subroutine omp_set_schedule_8 (kind, modifier)
+            use omp_lib_kinds
+            integer (omp_sched_kind), intent (in) :: kind
+            integer (8), intent (in) :: modifier
+          end subroutine omp_set_schedule_8
+        end interface
+
+        interface omp_get_schedule
+          subroutine omp_get_schedule (kind, modifier)
+            use omp_lib_kinds
+            integer (omp_sched_kind), intent (out) :: kind
+            integer (4), intent (out) :: modifier
+          end subroutine omp_get_schedule
+          subroutine omp_get_schedule_8 (kind, modifier)
+            use omp_lib_kinds
+            integer (omp_sched_kind), intent (out) :: kind
+            integer (8), intent (out) :: modifier
+          end subroutine omp_get_schedule_8
+        end interface
+
+        interface
+          function omp_get_thread_limit ()
+            use omp_lib_kinds
+            integer (omp_integer_kind) :: omp_get_thread_limit
+          end function omp_get_thread_limit
+        end interface
+
+        interface omp_set_max_active_levels
+          subroutine omp_set_max_active_levels (max_levels)
+            use omp_lib_kinds
+            integer (4), intent (in) :: max_levels
+          end subroutine omp_set_max_active_levels
+          subroutine omp_set_max_active_levels_8 (max_levels)
+
use omp_lib_kinds
+            integer (8), intent (in) :: max_levels
+          end subroutine omp_set_max_active_levels_8
+        end interface
+
+        interface
+          function omp_get_max_active_levels ()
+            use omp_lib_kinds
+            integer (omp_integer_kind) :: omp_get_max_active_levels
+          end function omp_get_max_active_levels
+        end interface
+
+        interface
+          function omp_get_level ()
+            use omp_lib_kinds
+            integer (omp_integer_kind) :: omp_get_level
+          end function omp_get_level
+        end interface
+
+        interface omp_get_ancestor_thread_num
+          function omp_get_ancestor_thread_num (level)
+            use omp_lib_kinds
+            integer (4), intent (in) :: level
+            integer (omp_integer_kind) :: omp_get_ancestor_thread_num
+          end function omp_get_ancestor_thread_num
+          function omp_get_ancestor_thread_num_8 (level)
+            use omp_lib_kinds
+            integer (8), intent (in) :: level
+            integer (omp_integer_kind) :: omp_get_ancestor_thread_num
+          end function omp_get_ancestor_thread_num_8
+        end interface
+
+        interface omp_get_team_size
+          function omp_get_team_size (level)
+            use omp_lib_kinds
+            integer (4), intent (in) :: level
+            integer (omp_integer_kind) :: omp_get_team_size
+          end function omp_get_team_size
+          function omp_get_team_size_8 (level)
+            use omp_lib_kinds
+            integer (8), intent (in) :: level
+            integer (omp_integer_kind) :: omp_get_team_size
+          end function omp_get_team_size_8
+        end interface
+
+        interface
+          function omp_get_active_level ()
+            use omp_lib_kinds
+            integer (omp_integer_kind) :: omp_get_active_level
+          end function omp_get_active_level
+        end interface
+
       end module omp_lib
diff --git a/libgomp/omp_lib.h.in b/libgomp/omp_lib.h.in
index 734f2f781fc..60677f666a1 100644
--- a/libgomp/omp_lib.h.in
+++ b/libgomp/omp_lib.h.in
@@ -1,4 +1,4 @@
-! Copyright (C) 2005 Free Software Foundation, Inc.
+! Copyright (C) 2005, 2007, 2008 Free Software Foundation, Inc.
 ! Contributed by Jakub Jelinek <jakub@redhat.com>.

 ! This file is part of the GNU OpenMP Library (libgomp).
@@ -26,9 +26,16 @@
 ! General Public License.
       integer omp_lock_kind, omp_nest_lock_kind, openmp_version
+      integer omp_sched_kind, omp_sched_static, omp_sched_dynamic
+      integer omp_sched_guided, omp_sched_auto
       parameter (omp_lock_kind = @OMP_LOCK_KIND@)
       parameter (omp_nest_lock_kind = @OMP_NEST_LOCK_KIND@)
-      parameter (openmp_version = 200505)
+      parameter (omp_sched_kind = 4)
+      parameter (omp_sched_static = 1)
+      parameter (omp_sched_dynamic = 2)
+      parameter (omp_sched_guided = 3)
+      parameter (omp_sched_auto = 4)
+      parameter (openmp_version = 200805)

       external omp_init_lock, omp_init_nest_lock
       external omp_destroy_lock, omp_destroy_nest_lock
@@ -51,3 +58,12 @@

       external omp_get_wtick, omp_get_wtime
       double precision omp_get_wtick, omp_get_wtime
+
+      external omp_set_schedule, omp_get_schedule
+      external omp_get_thread_limit, omp_set_max_active_levels
+      external omp_get_max_active_levels, omp_get_level
+      external omp_get_ancestor_thread_num, omp_get_team_size
+      external omp_get_active_level
+      integer*4 omp_get_thread_limit, omp_get_max_active_levels
+      integer*4 omp_get_level, omp_get_ancestor_thread_num
+      integer*4 omp_get_team_size, omp_get_active_level
diff --git a/libgomp/parallel.c b/libgomp/parallel.c
index edd344a90a8..3f2a3056138 100644
--- a/libgomp/parallel.c
+++ b/libgomp/parallel.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2005 Free Software Foundation, Inc.
+/* Copyright (C) 2005, 2007, 2008 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.

    This file is part of the GNU OpenMP Library (libgomp).
@@ -28,52 +28,107 @@
 /* This file handles the (bare) PARALLEL construct. */

 #include "libgomp.h"
+#include <limits.h>


 /* Determine the number of threads to be launched for a PARALLEL construct.
-   This algorithm is explicitly described in OpenMP 2.5 section 2.4.1.
+   This algorithm is explicitly described in OpenMP 3.0 section 2.4.1.
    SPECIFIED is a combination of the NUM_THREADS clause and the IF clause.
    If the IF clause is false, SPECIFIED is forced to 1.
    When NUM_THREADS is not present, SPECIFIED is 0. */
 
 unsigned
-gomp_resolve_num_threads (unsigned specified)
+gomp_resolve_num_threads (unsigned specified, unsigned count)
 {
-  /* Early exit for false IF condition or degenerate NUM_THREADS. */
+  struct gomp_thread *thread = gomp_thread();
+  struct gomp_task_icv *icv;
+  unsigned threads_requested, max_num_threads, num_threads;
+  unsigned long remaining;
+
+  icv = gomp_icv (false);
+
   if (specified == 1)
     return 1;
-
-  /* If this is a nested region, and nested regions are disabled, force
-     this team to use only one thread. */
-  if (gomp_thread()->ts.team && !gomp_nest_var)
+  else if (thread->ts.active_level >= 1 && !icv->nest_var)
+    return 1;
+  else if (thread->ts.active_level >= gomp_max_active_levels_var)
     return 1;
 
   /* If NUM_THREADS not specified, use nthreads_var. */
   if (specified == 0)
-    specified = gomp_nthreads_var;
+    threads_requested = icv->nthreads_var;
+  else
+    threads_requested = specified;
+
+  max_num_threads = threads_requested;
 
   /* If dynamic threads are enabled, bound the number of threads
      that we launch. */
-  if (gomp_dyn_var)
+  if (icv->dyn_var)
     {
       unsigned dyn = gomp_dynamic_max_threads ();
-      if (dyn < specified)
-        return dyn;
+      if (dyn < max_num_threads)
+        max_num_threads = dyn;
+
+      /* Optimization for parallel sections. */
+      if (count && count < max_num_threads)
+        max_num_threads = count;
     }
 
-  return specified;
+  /* ULONG_MAX stands for infinity.
*/
+  if (__builtin_expect (gomp_thread_limit_var == ULONG_MAX, 1)
+      || max_num_threads == 1)
+    return max_num_threads;
+
+#ifdef HAVE_SYNC_BUILTINS
+  do
+    {
+      remaining = gomp_remaining_threads_count;
+      num_threads = max_num_threads;
+      if (num_threads > remaining)
+        num_threads = remaining + 1;
+    }
+  while (__sync_val_compare_and_swap (&gomp_remaining_threads_count,
+                                      remaining, remaining - num_threads + 1)
+         != remaining);
+#else
+  gomp_mutex_lock (&gomp_remaining_threads_lock);
+  num_threads = max_num_threads;
+  remaining = gomp_remaining_threads_count;
+  if (num_threads > remaining)
+    num_threads = remaining + 1;
+  gomp_remaining_threads_count -= num_threads - 1;
+  gomp_mutex_unlock (&gomp_remaining_threads_lock);
+#endif
+
+  return num_threads;
 }
 
 void
 GOMP_parallel_start (void (*fn) (void *), void *data, unsigned num_threads)
 {
-  num_threads = gomp_resolve_num_threads (num_threads);
-  gomp_team_start (fn, data, num_threads, NULL);
+  num_threads = gomp_resolve_num_threads (num_threads, 0);
+  gomp_team_start (fn, data, num_threads, gomp_new_team (num_threads));
 }
 
 void
 GOMP_parallel_end (void)
 {
+  if (__builtin_expect (gomp_thread_limit_var != ULONG_MAX, 0))
+    {
+      struct gomp_thread *thr = gomp_thread ();
+      struct gomp_team *team = thr->ts.team;
+      if (team && team->nthreads > 1)
+        {
+#ifdef HAVE_SYNC_BUILTINS
+          __sync_fetch_and_add (&gomp_remaining_threads_count,
+                                1UL - team->nthreads);
+#else
+          gomp_mutex_lock (&gomp_remaining_threads_lock);
+          gomp_remaining_threads_count -= team->nthreads - 1;
+#endif
+        }
+    }
   gomp_team_end ();
 }
 
@@ -87,40 +142,63 @@ omp_get_num_threads (void)
   return team ? team->nthreads : 1;
 }
 
-/* ??? Does this function need to disregard dyn_var? I don't see
-   how else one could get a useable "maximum". */
-
 int
-omp_get_max_threads (void)
+omp_get_thread_num (void)
 {
-  return gomp_resolve_num_threads (0);
+  return gomp_thread ()->ts.team_id;
 }
 
+/* This wasn't right for OpenMP 2.5.
Active region used to be non-zero
+   when the IF clause doesn't evaluate to false, starting with OpenMP 3.0
+   it is non-zero with more than one thread in the team. */
+
 int
-omp_get_thread_num (void)
+omp_in_parallel (void)
 {
-  return gomp_thread ()->ts.team_id;
+  return gomp_thread ()->ts.active_level > 0;
 }
 
-/* ??? This isn't right. The definition of this function is false if any
-   of the IF clauses for any of the parallels is false. Which is not the
-   same thing as any outer team having more than one thread. */
+int
+omp_get_level (void)
+{
+  return gomp_thread ()->ts.level;
+}
 
-int omp_in_parallel (void)
+int
+omp_get_ancestor_thread_num (int level)
 {
-  struct gomp_team *team = gomp_thread ()->ts.team;
+  struct gomp_team_state *ts = &gomp_thread ()->ts;
+  if (level < 0 || level > ts->level)
+    return -1;
+  for (level = ts->level - level; level > 0; --level)
+    ts = &ts->team->prev_ts;
+  return ts->team_id;
+}
 
-  while (team)
-    {
-      if (team->nthreads > 1)
-        return true;
-      team = team->prev_ts.team;
-    }
+int
+omp_get_team_size (int level)
+{
+  struct gomp_team_state *ts = &gomp_thread ()->ts;
+  if (level < 0 || level > ts->level)
+    return -1;
+  for (level = ts->level - level; level > 0; --level)
+    ts = &ts->team->prev_ts;
+  if (ts->team == NULL)
+    return 1;
+  else
+    return ts->team->nthreads;
+}
 
-  return false;
+int
+omp_get_active_level (void)
+{
+  return gomp_thread ()->ts.active_level;
 }
 
 ialias (omp_get_num_threads)
-ialias (omp_get_max_threads)
 ialias (omp_get_thread_num)
 ialias (omp_in_parallel)
+ialias (omp_get_level)
+ialias (omp_get_ancestor_thread_num)
+ialias (omp_get_team_size)
+ialias (omp_get_active_level)
diff --git a/libgomp/sections.c b/libgomp/sections.c
index 9ccc65e4b66..27625efec3e 100644
--- a/libgomp/sections.c
+++ b/libgomp/sections.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2005, 2007 Free Software Foundation, Inc.
+/* Copyright (C) 2005, 2007, 2008 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.
    This file is part of the GNU OpenMP Library (libgomp).
@@ -59,14 +59,24 @@ GOMP_sections_start (unsigned count)
   long s, e, ret;
 
   if (gomp_work_share_start (false))
-    gomp_sections_init (thr->ts.work_share, count);
+    {
+      gomp_sections_init (thr->ts.work_share, count);
+      gomp_work_share_init_done ();
+    }
 
+#ifdef HAVE_SYNC_BUILTINS
+  if (gomp_iter_dynamic_next (&s, &e))
+    ret = s;
+  else
+    ret = 0;
+#else
+  gomp_mutex_lock (&thr->ts.work_share->lock);
   if (gomp_iter_dynamic_next_locked (&s, &e))
     ret = s;
   else
     ret = 0;
-
   gomp_mutex_unlock (&thr->ts.work_share->lock);
+#endif
 
   return ret;
 }
@@ -83,15 +93,23 @@ GOMP_sections_start (unsigned count)
 unsigned
 GOMP_sections_next (void)
 {
-  struct gomp_thread *thr = gomp_thread ();
   long s, e, ret;
 
+#ifdef HAVE_SYNC_BUILTINS
+  if (gomp_iter_dynamic_next (&s, &e))
+    ret = s;
+  else
+    ret = 0;
+#else
+  struct gomp_thread *thr = gomp_thread ();
+
   gomp_mutex_lock (&thr->ts.work_share->lock);
   if (gomp_iter_dynamic_next_locked (&s, &e))
     ret = s;
   else
     ret = 0;
   gomp_mutex_unlock (&thr->ts.work_share->lock);
+#endif
 
   return ret;
 }
@@ -103,15 +121,12 @@ void
 GOMP_parallel_sections_start (void (*fn) (void *), void *data,
                               unsigned num_threads, unsigned count)
 {
-  struct gomp_work_share *ws;
-
-  num_threads = gomp_resolve_num_threads (num_threads);
-  if (gomp_dyn_var && num_threads > count)
-    num_threads = count;
+  struct gomp_team *team;
 
-  ws = gomp_new_work_share (false, num_threads);
-  gomp_sections_init (ws, count);
-  gomp_team_start (fn, data, num_threads, ws);
+  num_threads = gomp_resolve_num_threads (num_threads, count);
+  team = gomp_new_team (num_threads);
+  gomp_sections_init (&team->work_shares[0], count);
+  gomp_team_start (fn, data, num_threads, team);
 }
 
 /* The GOMP_section_end* routines are called after the thread is told
diff --git a/libgomp/single.c b/libgomp/single.c
index dde05d9ceb8..16c7fa988a1 100644
--- a/libgomp/single.c
+++ b/libgomp/single.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2005 Free Software Foundation, Inc.
+/* Copyright (C) 2005, 2008 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.
 
    This file is part of the GNU OpenMP Library (libgomp).
@@ -37,10 +37,24 @@
 bool
 GOMP_single_start (void)
 {
+#ifdef HAVE_SYNC_BUILTINS
+  struct gomp_thread *thr = gomp_thread ();
+  struct gomp_team *team = thr->ts.team;
+  unsigned long single_count;
+
+  if (__builtin_expect (team == NULL, 0))
+    return true;
+
+  single_count = thr->ts.single_count++;
+  return __sync_bool_compare_and_swap (&team->single_count, single_count,
+                                       single_count + 1L);
+#else
   bool ret = gomp_work_share_start (false);
-  gomp_mutex_unlock (&gomp_thread ()->ts.work_share->lock);
+  if (ret)
+    gomp_work_share_init_done ();
   gomp_work_share_end_nowait ();
   return ret;
+#endif
 }
 
 /* This routine is called when first encountering a SINGLE construct that
@@ -57,13 +71,15 @@ GOMP_single_copy_start (void)
   void *ret;
 
   first = gomp_work_share_start (false);
-  gomp_mutex_unlock (&thr->ts.work_share->lock);
 
   if (first)
-    ret = NULL;
+    {
+      gomp_work_share_init_done ();
+      ret = NULL;
+    }
   else
     {
-      gomp_barrier_wait (&thr->ts.team->barrier);
+      gomp_team_barrier_wait (&thr->ts.team->barrier);
 
       ret = thr->ts.work_share->copyprivate;
       gomp_work_share_end_nowait ();
@@ -84,7 +100,7 @@ GOMP_single_copy_end (void *data)
   if (team != NULL)
     {
       thr->ts.work_share->copyprivate = data;
-      gomp_barrier_wait (&team->barrier);
+      gomp_team_barrier_wait (&team->barrier);
     }
 
   gomp_work_share_end_nowait ();
diff --git a/libgomp/task.c b/libgomp/task.c
new file mode 100644
index 00000000000..903948ceca3
--- /dev/null
+++ b/libgomp/task.c
@@ -0,0 +1,361 @@
+/* Copyright (C) 2007, 2008 Free Software Foundation, Inc.
+   Contributed by Richard Henderson <rth@redhat.com>.
+
+   This file is part of the GNU OpenMP Library (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU Lesser General Public License as published by
+   the Free Software Foundation; either version 2.1 of the License, or
+   (at your option) any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for
+   more details.
+
+   You should have received a copy of the GNU Lesser General Public License
+   along with libgomp; see the file COPYING.LIB. If not, write to the
+   Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+   MA 02110-1301, USA. */
+
+/* As a special exception, if you link this library with other files, some
+   of which are compiled with GCC, to produce an executable, this library
+   does not by itself cause the resulting executable to be covered by the
+   GNU General Public License. This exception does not however invalidate
+   any other reasons why the executable file might be covered by the GNU
+   General Public License. */
+
+/* This file handles the maintainence of tasks in response to task
+   creation and termination. */
+
+#include "libgomp.h"
+#include <stdlib.h>
+#include <string.h>
+
+
+/* Create a new task data structure. */
+
+void
+gomp_init_task (struct gomp_task *task, struct gomp_task *parent_task,
+                struct gomp_task_icv *prev_icv)
+{
+  task->parent = parent_task;
+  task->icv = *prev_icv;
+  task->kind = GOMP_TASK_IMPLICIT;
+  task->in_taskwait = false;
+  task->children = NULL;
+  gomp_sem_init (&task->taskwait_sem, 0);
+}
+
+/* Clean up a task, after completing it.
*/
+
+void
+gomp_end_task (void)
+{
+  struct gomp_thread *thr = gomp_thread ();
+  struct gomp_task *task = thr->task;
+
+  gomp_finish_task (task);
+  thr->task = task->parent;
+}
+
+static inline void
+gomp_clear_parent (struct gomp_task *children)
+{
+  struct gomp_task *task = children;
+
+  if (task)
+    do
+      {
+        task->parent = NULL;
+        task = task->next_child;
+      }
+    while (task != children);
+}
+
+/* Called when encountering an explicit task directive. If IF_CLAUSE is
+   false, then we must not delay in executing the task. If UNTIED is true,
+   then the task may be executed by any member of the team. */
+
+void
+GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *),
+           long arg_size, long arg_align, bool if_clause,
+           unsigned flags __attribute__((unused)))
+{
+  struct gomp_thread *thr = gomp_thread ();
+  struct gomp_team *team = thr->ts.team;
+
+#ifdef HAVE_BROKEN_POSIX_SEMAPHORES
+  /* If pthread_mutex_* is used for omp_*lock*, then each task must be
+     tied to one thread all the time. This means UNTIED tasks must be
+     tied and if CPYFN is non-NULL IF(0) must be forced, as CPYFN
+     might be running on different thread than FN.
*/
+  if (cpyfn)
+    if_clause = false;
+  if (flags & 1)
+    flags &= ~1;
+#endif
+
+  if (!if_clause || team == NULL
+      || team->task_count > 64 * team->nthreads)
+    {
+      struct gomp_task task;
+
+      gomp_init_task (&task, thr->task, gomp_icv (false));
+      task.kind = GOMP_TASK_IFFALSE;
+      thr->task = &task;
+      if (__builtin_expect (cpyfn != NULL, 0))
+        {
+          char buf[arg_size + arg_align - 1];
+          char *arg = (char *) (((uintptr_t) buf + arg_align - 1)
+                                & ~(uintptr_t) (arg_align - 1));
+          cpyfn (arg, data);
+          fn (arg);
+        }
+      else
+        fn (data);
+      if (task.children)
+        {
+          gomp_mutex_lock (&team->task_lock);
+          gomp_clear_parent (task.children);
+          gomp_mutex_unlock (&team->task_lock);
+        }
+      gomp_end_task ();
+    }
+  else
+    {
+      struct gomp_task *task;
+      struct gomp_task *parent = thr->task;
+      char *arg;
+      bool do_wake;
+
+      task = gomp_malloc (sizeof (*task) + arg_size + arg_align - 1);
+      arg = (char *) (((uintptr_t) (task + 1) + arg_align - 1)
+                      & ~(uintptr_t) (arg_align - 1));
+      gomp_init_task (task, parent, gomp_icv (false));
+      task->kind = GOMP_TASK_IFFALSE;
+      thr->task = task;
+      if (cpyfn)
+        cpyfn (arg, data);
+      else
+        memcpy (arg, data, arg_size);
+      thr->task = parent;
+      task->kind = GOMP_TASK_WAITING;
+      task->fn = fn;
+      task->fn_data = arg;
+      gomp_mutex_lock (&team->task_lock);
+      if (parent->children)
+        {
+          task->next_child = parent->children;
+          task->prev_child = parent->children->prev_child;
+          task->next_child->prev_child = task;
+          task->prev_child->next_child = task;
+        }
+      else
+        {
+          task->next_child = task;
+          task->prev_child = task;
+        }
+      parent->children = task;
+      if (team->task_queue)
+        {
+          task->next_queue = team->task_queue;
+          task->prev_queue = team->task_queue->prev_queue;
+          task->next_queue->prev_queue = task;
+          task->prev_queue->next_queue = task;
+        }
+      else
+        {
+          task->next_queue = task;
+          task->prev_queue = task;
+          team->task_queue = task;
+        }
+      if (team->task_count++ == 0)
+        gomp_team_barrier_set_task_pending (&team->barrier);
+      do_wake = team->task_running_count <
team->nthreads;
+      gomp_mutex_unlock (&team->task_lock);
+      if (do_wake)
+        gomp_team_barrier_wake (&team->barrier, 1);
+    }
+}
+
+void
+gomp_barrier_handle_tasks (gomp_barrier_state_t state)
+{
+  struct gomp_thread *thr = gomp_thread ();
+  struct gomp_team *team = thr->ts.team;
+  struct gomp_task *task = thr->task;
+  struct gomp_task *child_task = NULL;
+  struct gomp_task *to_free = NULL;
+
+  gomp_mutex_lock (&team->task_lock);
+  if (gomp_barrier_last_thread (state))
+    {
+      if (team->task_count == 0)
+        {
+          gomp_team_barrier_done (&team->barrier, state);
+          gomp_mutex_unlock (&team->task_lock);
+          gomp_team_barrier_wake (&team->barrier, 0);
+          return;
+        }
+      gomp_team_barrier_set_waiting_for_tasks (&team->barrier);
+    }
+
+  while (1)
+    {
+      if (team->task_queue != NULL)
+        {
+          struct gomp_task *parent;
+
+          child_task = team->task_queue;
+          parent = child_task->parent;
+          if (parent && parent->children == child_task)
+            parent->children = child_task->next_child;
+          child_task->prev_queue->next_queue = child_task->next_queue;
+          child_task->next_queue->prev_queue = child_task->prev_queue;
+          if (child_task->next_queue != child_task)
+            team->task_queue = child_task->next_queue;
+          else
+            team->task_queue = NULL;
+          child_task->kind = GOMP_TASK_TIED;
+          team->task_running_count++;
+          if (team->task_count == team->task_running_count)
+            gomp_team_barrier_clear_task_pending (&team->barrier);
+        }
+      gomp_mutex_unlock (&team->task_lock);
+      if (to_free)
+        {
+          gomp_finish_task (to_free);
+          free (to_free);
+          to_free = NULL;
+        }
+      if (child_task)
+        {
+          thr->task = child_task;
+          child_task->fn (child_task->fn_data);
+          thr->task = task;
+        }
+      else
+        return;
+      gomp_mutex_lock (&team->task_lock);
+      if (child_task)
+        {
+          struct gomp_task *parent = child_task->parent;
+          if (parent)
+            {
+              child_task->prev_child->next_child = child_task->next_child;
+              child_task->next_child->prev_child = child_task->prev_child;
+              if (parent->children == child_task)
+                {
+                  if (child_task->next_child != child_task)
+
parent->children = child_task->next_child;
+                  else
+                    {
+                      parent->children = NULL;
+                      if (parent->in_taskwait)
+                        gomp_sem_post (&parent->taskwait_sem);
+                    }
+                }
+            }
+          gomp_clear_parent (child_task->children);
+          to_free = child_task;
+          child_task = NULL;
+          team->task_running_count--;
+          if (--team->task_count == 0
+              && gomp_team_barrier_waiting_for_tasks (&team->barrier))
+            {
+              gomp_team_barrier_done (&team->barrier, state);
+              gomp_mutex_unlock (&team->task_lock);
+              gomp_team_barrier_wake (&team->barrier, 0);
+            }
+        }
+    }
+}
+
+/* Called when encountering a taskwait directive. */
+
+void
+GOMP_taskwait (void)
+{
+  struct gomp_thread *thr = gomp_thread ();
+  struct gomp_team *team = thr->ts.team;
+  struct gomp_task *task = thr->task;
+  struct gomp_task *child_task = NULL;
+  struct gomp_task *to_free = NULL;
+
+  if (task == NULL || task->children == NULL)
+    return;
+  gomp_mutex_lock (&team->task_lock);
+  while (1)
+    {
+      if (task->children == NULL)
+        {
+          gomp_mutex_unlock (&team->task_lock);
+          if (to_free)
+            {
+              gomp_finish_task (to_free);
+              free (to_free);
+            }
+          return;
+        }
+      if (task->children->kind == GOMP_TASK_WAITING)
+        {
+          child_task = task->children;
+          task->children = child_task->next_child;
+          child_task->prev_queue->next_queue = child_task->next_queue;
+          child_task->next_queue->prev_queue = child_task->prev_queue;
+          if (team->task_queue == child_task)
+            {
+              if (child_task->next_queue != child_task)
+                team->task_queue = child_task->next_queue;
+              else
+                team->task_queue = NULL;
+            }
+          child_task->kind = GOMP_TASK_TIED;
+          team->task_running_count++;
+          if (team->task_count == team->task_running_count)
+            gomp_team_barrier_clear_task_pending (&team->barrier);
+        }
+      else
+        /* All tasks we are waiting for are already running
+           in other threads. Wait for them.
*/
+        task->in_taskwait = true;
+      gomp_mutex_unlock (&team->task_lock);
+      if (to_free)
+        {
+          gomp_finish_task (to_free);
+          free (to_free);
+          to_free = NULL;
+        }
+      if (child_task)
+        {
+          thr->task = child_task;
+          child_task->fn (child_task->fn_data);
+          thr->task = task;
+        }
+      else
+        {
+          gomp_sem_wait (&task->taskwait_sem);
+          task->in_taskwait = false;
+          return;
+        }
+      gomp_mutex_lock (&team->task_lock);
+      if (child_task)
+        {
+          child_task->prev_child->next_child = child_task->next_child;
+          child_task->next_child->prev_child = child_task->prev_child;
+          if (task->children == child_task)
+            {
+              if (child_task->next_child != child_task)
+                task->children = child_task->next_child;
+              else
+                task->children = NULL;
+            }
+          gomp_clear_parent (child_task->children);
+          to_free = child_task;
+          child_task = NULL;
+          team->task_count--;
+          team->task_running_count--;
+        }
+    }
+}
diff --git a/libgomp/team.c b/libgomp/team.c
index 7d50bfc29af..18b02e72f90 100644
--- a/libgomp/team.c
+++ b/libgomp/team.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2005, 2006, 2007 Free Software Foundation, Inc.
+/* Copyright (C) 2005, 2006, 2007, 2008 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.
 
    This file is part of the GNU OpenMP Library (libgomp).
@@ -32,17 +32,12 @@
 #include <stdlib.h>
 #include <string.h>
 
-/* This array manages threads spawned from the top level, which will
-   return to the idle loop once the current PARALLEL construct ends. */
-static struct gomp_thread **gomp_threads;
-static unsigned gomp_threads_size;
-static unsigned gomp_threads_used;
-
 /* This attribute contains PTHREAD_CREATE_DETACHED. */
 pthread_attr_t gomp_thread_attr;
 
-/* This barrier holds and releases threads waiting in gomp_threads. */
-static gomp_barrier_t gomp_threads_dock;
+/* This key is for the thread destructor. */
+pthread_key_t gomp_thread_destructor;
+
 
 /* This is the libgomp per-thread data structure.
*/
 #ifdef HAVE_TLS
@@ -56,9 +51,11 @@ pthread_key_t gomp_tls_key;
 
 struct gomp_thread_start_data
 {
-  struct gomp_team_state ts;
   void (*fn) (void *);
   void *fn_data;
+  struct gomp_team_state ts;
+  struct gomp_task *task;
+  struct gomp_thread_pool *thread_pool;
   bool nested;
 };
 
@@ -71,6 +68,7 @@ gomp_thread_start (void *xdata)
 {
   struct gomp_thread_start_data *data = xdata;
   struct gomp_thread *thr;
+  struct gomp_thread_pool *pool;
   void (*local_fn) (void *);
   void *local_data;
 
@@ -86,43 +84,46 @@ gomp_thread_start (void *xdata)
   /* Extract what we need from data. */
   local_fn = data->fn;
   local_data = data->fn_data;
+  thr->thread_pool = data->thread_pool;
   thr->ts = data->ts;
+  thr->task = data->task;
 
   thr->ts.team->ordered_release[thr->ts.team_id] = &thr->release;
 
+  /* Make thread pool local. */
+  pool = thr->thread_pool;
+
   if (data->nested)
     {
-      gomp_barrier_wait (&thr->ts.team->barrier);
+      struct gomp_team *team = thr->ts.team;
+      struct gomp_task *task = thr->task;
+
+      gomp_barrier_wait (&team->barrier);
+
       local_fn (local_data);
-      gomp_barrier_wait (&thr->ts.team->barrier);
+      gomp_team_barrier_wait (&team->barrier);
+      gomp_finish_task (task);
+      gomp_barrier_wait_last (&team->barrier);
     }
   else
     {
-      gomp_threads[thr->ts.team_id] = thr;
+      pool->threads[thr->ts.team_id] = thr;
 
-      gomp_barrier_wait (&gomp_threads_dock);
+      gomp_barrier_wait (&pool->threads_dock);
       do
         {
-          struct gomp_team *team;
+          struct gomp_team *team = thr->ts.team;
+          struct gomp_task *task = thr->task;
 
           local_fn (local_data);
+          gomp_team_barrier_wait (&team->barrier);
+          gomp_finish_task (task);
 
-          /* Clear out the team and function data. This is a debugging
-             signal that we're in fact back in the dock.
*/
-          team = thr->ts.team;
-          thr->fn = NULL;
-          thr->data = NULL;
-          thr->ts.team = NULL;
-          thr->ts.work_share = NULL;
-          thr->ts.team_id = 0;
-          thr->ts.work_share_generation = 0;
-          thr->ts.static_trip = 0;
-
-          gomp_barrier_wait (&team->barrier);
-          gomp_barrier_wait (&gomp_threads_dock);
+          gomp_barrier_wait (&pool->threads_dock);
 
           local_fn = thr->fn;
           local_data = thr->data;
+          thr->fn = NULL;
         }
       while (local_fn);
     }
@@ -133,28 +134,43 @@ gomp_thread_start (void *xdata)
 
 /* Create a new team data structure. */
 
-static struct gomp_team *
-new_team (unsigned nthreads, struct gomp_work_share *work_share)
+struct gomp_team *
+gomp_new_team (unsigned nthreads)
 {
   struct gomp_team *team;
   size_t size;
+  int i;
 
-  size = sizeof (*team) + nthreads * sizeof (team->ordered_release[0]);
+  size = sizeof (*team) + nthreads * (sizeof (team->ordered_release[0])
+                                      + sizeof (team->implicit_task[0]));
   team = gomp_malloc (size);
-  gomp_mutex_init (&team->work_share_lock);
 
-  team->work_shares = gomp_malloc (4 * sizeof (struct gomp_work_share *));
-  team->generation_mask = 3;
-  team->oldest_live_gen = work_share == NULL;
-  team->num_live_gen = work_share != NULL;
-  team->work_shares[0] = work_share;
+  team->work_share_chunk = 8;
+#ifdef HAVE_SYNC_BUILTINS
+  team->single_count = 0;
+#else
+  gomp_mutex_init (&team->work_share_list_free_lock);
+#endif
+  gomp_init_work_share (&team->work_shares[0], false, nthreads);
+  team->work_shares[0].next_alloc = NULL;
+  team->work_share_list_free = NULL;
+  team->work_share_list_alloc = &team->work_shares[1];
+  for (i = 1; i < 7; i++)
+    team->work_shares[i].next_free = &team->work_shares[i + 1];
+  team->work_shares[i].next_free = NULL;
 
   team->nthreads = nthreads;
   gomp_barrier_init (&team->barrier, nthreads);
 
   gomp_sem_init (&team->master_release, 0);
+  team->ordered_release = (void *) &team->implicit_task[nthreads];
   team->ordered_release[0] = &team->master_release;
 
+  gomp_mutex_init (&team->task_lock);
+  team->task_queue = NULL;
+  team->task_count = 0;
+
team->task_running_count = 0;
+
   return team;
 }
 
@@ -164,31 +180,98 @@ new_team (unsigned nthreads, struct gomp_work_share *work_share)
 static void
 free_team (struct gomp_team *team)
 {
-  free (team->work_shares);
-  gomp_mutex_destroy (&team->work_share_lock);
   gomp_barrier_destroy (&team->barrier);
-  gomp_sem_destroy (&team->master_release);
+  gomp_mutex_destroy (&team->task_lock);
   free (team);
 }
 
+/* Allocate and initialize a thread pool. */
+
+static struct gomp_thread_pool *gomp_new_thread_pool (void)
+{
+  struct gomp_thread_pool *pool
+    = gomp_malloc (sizeof(struct gomp_thread_pool));
+  pool->threads = NULL;
+  pool->threads_size = 0;
+  pool->threads_used = 0;
+  pool->last_team = NULL;
+  return pool;
+}
+
+static void
+gomp_free_pool_helper (void *thread_pool)
+{
+  struct gomp_thread_pool *pool
+    = (struct gomp_thread_pool *) thread_pool;
+  gomp_barrier_wait_last (&pool->threads_dock);
+  pthread_exit (NULL);
+}
+
+/* Free a thread pool and release its threads. */
+
+static void
+gomp_free_thread (void *arg __attribute__((unused)))
+{
+  struct gomp_thread *thr = gomp_thread ();
+  struct gomp_thread_pool *pool = thr->thread_pool;
+  if (pool)
+    {
+      if (pool->threads_used > 0)
+        {
+          int i;
+          for (i = 1; i < pool->threads_used; i++)
+            {
+              struct gomp_thread *nthr = pool->threads[i];
+              nthr->fn = gomp_free_pool_helper;
+              nthr->data = pool;
+            }
+          /* This barrier undocks threads docked on pool->threads_dock. */
+          gomp_barrier_wait (&pool->threads_dock);
+          /* And this waits till all threads have called gomp_barrier_wait_last
+             in gomp_free_pool_helper. */
+          gomp_barrier_wait (&pool->threads_dock);
+          /* Now it is safe to destroy the barrier and free the pool. */
+          gomp_barrier_destroy (&pool->threads_dock);
+        }
+      free (pool->threads);
+      if (pool->last_team)
+        free_team (pool->last_team);
+      free (pool);
+      thr->thread_pool = NULL;
+    }
+  if (thr->task != NULL)
+    {
+      struct gomp_task *task = thr->task;
+      gomp_end_task ();
+      free (task);
+    }
+}
 
 /* Launch a team.
*/
 
 void
 gomp_team_start (void (*fn) (void *), void *data, unsigned nthreads,
-                 struct gomp_work_share *work_share)
+                 struct gomp_team *team)
 {
   struct gomp_thread_start_data *start_data;
   struct gomp_thread *thr, *nthr;
-  struct gomp_team *team;
+  struct gomp_task *task;
+  struct gomp_task_icv *icv;
   bool nested;
+  struct gomp_thread_pool *pool;
   unsigned i, n, old_threads_used = 0;
   pthread_attr_t thread_attr, *attr;
 
   thr = gomp_thread ();
   nested = thr->ts.team != NULL;
-
-  team = new_team (nthreads, work_share);
+  if (__builtin_expect (thr->thread_pool == NULL, 0))
+    {
+      thr->thread_pool = gomp_new_thread_pool ();
+      pthread_setspecific (gomp_thread_destructor, thr);
+    }
+  pool = thr->thread_pool;
+  task = thr->task;
+  icv = task ? &task->icv : &gomp_global_icv;
 
   /* Always save the previous state, even if this isn't a nested team.
      In particular, we should save any work share state from an outer
@@ -196,10 +279,18 @@ gomp_team_start (void (*fn) (void *), void *data, unsigned nthreads,
   team->prev_ts = thr->ts;
 
   thr->ts.team = team;
-  thr->ts.work_share = work_share;
   thr->ts.team_id = 0;
-  thr->ts.work_share_generation = 0;
+  ++thr->ts.level;
+  if (nthreads > 1)
+    ++thr->ts.active_level;
+  thr->ts.work_share = &team->work_shares[0];
+  thr->ts.last_work_share = NULL;
+#ifdef HAVE_SYNC_BUILTINS
+  thr->ts.single_count = 0;
+#endif
   thr->ts.static_trip = 0;
+  thr->task = &team->implicit_task[0];
+  gomp_init_task (thr->task, task, icv);
 
   if (nthreads == 1)
     return;
@@ -213,14 +304,14 @@ gomp_team_start (void (*fn) (void *), void *data, unsigned nthreads,
      only the initial program thread will modify gomp_threads.
*/
   if (!nested)
     {
-      old_threads_used = gomp_threads_used;
+      old_threads_used = pool->threads_used;
 
       if (nthreads <= old_threads_used)
         n = nthreads;
       else if (old_threads_used == 0)
         {
           n = 0;
-          gomp_barrier_init (&gomp_threads_dock, nthreads);
+          gomp_barrier_init (&pool->threads_dock, nthreads);
         }
       else
         {
@@ -228,23 +319,30 @@ gomp_team_start (void (*fn) (void *), void *data, unsigned nthreads,
 
           /* Increase the barrier threshold to make sure all new
              threads arrive before the team is released. */
-          gomp_barrier_reinit (&gomp_threads_dock, nthreads);
+          gomp_barrier_reinit (&pool->threads_dock, nthreads);
         }
 
       /* Not true yet, but soon will be. We're going to release all
-         threads from the dock, and those that aren't part of the
+         threads from the dock, and those that aren't part of the
          team will exit. */
-      gomp_threads_used = nthreads;
+      pool->threads_used = nthreads;
 
       /* Release existing idle threads. */
       for (; i < n; ++i)
         {
-          nthr = gomp_threads[i];
+          nthr = pool->threads[i];
           nthr->ts.team = team;
-          nthr->ts.work_share = work_share;
+          nthr->ts.work_share = &team->work_shares[0];
+          nthr->ts.last_work_share = NULL;
           nthr->ts.team_id = i;
-          nthr->ts.work_share_generation = 0;
+          nthr->ts.level = team->prev_ts.level + 1;
+          nthr->ts.active_level = thr->ts.active_level;
+#ifdef HAVE_SYNC_BUILTINS
+          nthr->ts.single_count = 0;
+#endif
           nthr->ts.static_trip = 0;
+          nthr->task = &team->implicit_task[i];
+          gomp_init_task (nthr->task, task, icv);
           nthr->fn = fn;
           nthr->data = data;
           team->ordered_release[i] = &nthr->release;
@@ -254,20 +352,36 @@ gomp_team_start (void (*fn) (void *), void *data, unsigned nthreads,
         goto do_release;
 
       /* If necessary, expand the size of the gomp_threads array. It is
-         expected that changes in the number of threads is rare, thus we
+         expected that changes in the number of threads are rare, thus we
          make no effort to expand gomp_threads_size geometrically.
*/
-      if (nthreads >= gomp_threads_size)
+      if (nthreads >= pool->threads_size)
         {
-          gomp_threads_size = nthreads + 1;
-          gomp_threads
-            = gomp_realloc (gomp_threads,
-                            gomp_threads_size
+          pool->threads_size = nthreads + 1;
+          pool->threads
+            = gomp_realloc (pool->threads,
+                            pool->threads_size
                             * sizeof (struct gomp_thread_data *));
         }
     }
 
+  if (__builtin_expect (nthreads > old_threads_used, 0))
+    {
+      long diff = (long) nthreads - (long) old_threads_used;
+
+      if (old_threads_used == 0)
+        --diff;
+
+#ifdef HAVE_SYNC_BUILTINS
+      __sync_fetch_and_add (&gomp_managed_threads, diff);
+#else
+      gomp_mutex_lock (&gomp_remaining_threads_lock);
+      gomp_managed_threads += diff;
+      gomp_mutex_unlock (&gomp_remaining_threads_lock);
+#endif
+    }
+
   attr = &gomp_thread_attr;
-  if (gomp_cpu_affinity != NULL)
+  if (__builtin_expect (gomp_cpu_affinity != NULL, 0))
     {
       size_t stacksize;
       pthread_attr_init (&thread_attr);
@@ -286,13 +400,21 @@ gomp_team_start (void (*fn) (void *), void *data, unsigned nthreads,
       pthread_t pt;
       int err;
 
+      start_data->fn = fn;
+      start_data->fn_data = data;
       start_data->ts.team = team;
-      start_data->ts.work_share = work_share;
+      start_data->ts.work_share = &team->work_shares[0];
+      start_data->ts.last_work_share = NULL;
       start_data->ts.team_id = i;
-      start_data->ts.work_share_generation = 0;
+      start_data->ts.level = team->prev_ts.level + 1;
+      start_data->ts.active_level = thr->ts.active_level;
+#ifdef HAVE_SYNC_BUILTINS
+      start_data->ts.single_count = 0;
+#endif
       start_data->ts.static_trip = 0;
-      start_data->fn = fn;
-      start_data->fn_data = data;
+      start_data->task = &team->implicit_task[i];
+      gomp_init_task (start_data->task, task, icv);
+      start_data->thread_pool = pool;
       start_data->nested = nested;
 
       if (gomp_cpu_affinity != NULL)
@@ -303,18 +425,30 @@ gomp_team_start (void (*fn) (void *), void *data, unsigned nthreads,
         gomp_fatal ("Thread creation failed: %s", strerror (err));
     }
 
-  if (gomp_cpu_affinity != NULL)
+  if (__builtin_expect (gomp_cpu_affinity != NULL, 0))
     pthread_attr_destroy (&thread_attr);
 
  do_release:
-  gomp_barrier_wait (nested ? &team->barrier : &gomp_threads_dock);
+  gomp_barrier_wait (nested ? &team->barrier : &pool->threads_dock);
 
   /* Decrease the barrier threshold to match the number of threads
      that should arrive back at the end of this team. The extra
      threads should be exiting. Note that we arrange for this test
      to never be true for nested teams. */
-  if (nthreads < old_threads_used)
-    gomp_barrier_reinit (&gomp_threads_dock, nthreads);
+  if (__builtin_expect (nthreads < old_threads_used, 0))
+    {
+      long diff = (long) nthreads - (long) old_threads_used;
+
+      gomp_barrier_reinit (&pool->threads_dock, nthreads);
+
+#ifdef HAVE_SYNC_BUILTINS
+      __sync_fetch_and_add (&gomp_managed_threads, diff);
+#else
+      gomp_mutex_lock (&gomp_remaining_threads_lock);
+      gomp_managed_threads += diff;
+      gomp_mutex_unlock (&gomp_remaining_threads_lock);
+#endif
+    }
 }
 
 
@@ -327,11 +461,52 @@ gomp_team_end (void)
   struct gomp_thread *thr = gomp_thread ();
   struct gomp_team *team = thr->ts.team;
 
-  gomp_barrier_wait (&team->barrier);
+  /* This barrier handles all pending explicit threads. */
+  gomp_team_barrier_wait (&team->barrier);
+  gomp_fini_work_share (thr->ts.work_share);
 
+  gomp_end_task ();
   thr->ts = team->prev_ts;
 
-  free_team (team);
+  if (__builtin_expect (thr->ts.team != NULL, 0))
+    {
+#ifdef HAVE_SYNC_BUILTINS
+      __sync_fetch_and_add (&gomp_managed_threads, 1L - team->nthreads);
+#else
+      gomp_mutex_lock (&gomp_remaining_threads_lock);
+      gomp_managed_threads -= team->nthreads - 1L;
+      gomp_mutex_unlock (&gomp_remaining_threads_lock);
+#endif
+      /* This barrier has gomp_barrier_wait_last counterparts
+         and ensures the team can be safely destroyed.
*/
+      gomp_barrier_wait (&team->barrier);
+    }
+
+  if (__builtin_expect (team->work_shares[0].next_alloc != NULL, 0))
+    {
+      struct gomp_work_share *ws = team->work_shares[0].next_alloc;
+      do
+        {
+          struct gomp_work_share *next_ws = ws->next_alloc;
+          free (ws);
+          ws = next_ws;
+        }
+      while (ws != NULL);
+    }
+  gomp_sem_destroy (&team->master_release);
+#ifndef HAVE_SYNC_BUILTINS
+  gomp_mutex_destroy (&team->work_share_list_free_lock);
+#endif
+
+  if (__builtin_expect (thr->ts.team != NULL, 0))
+    free_team (team);
+  else
+    {
+      struct gomp_thread_pool *pool = thr->thread_pool;
+      if (pool->last_team)
+        free_team (pool->last_team);
+      pool->last_team = team;
+    }
 }


@@ -349,6 +524,9 @@ initialize_team (void)
   pthread_setspecific (gomp_tls_key, &initial_thread_tls_data);
 #endif

+  if (pthread_key_create (&gomp_thread_destructor, gomp_free_thread) != 0)
+    gomp_fatal ("could not create thread pool destructor.");
+
 #ifdef HAVE_TLS
   thr = &gomp_tls_data;
 #else
@@ -356,3 +534,22 @@ initialize_team (void)
 #endif
   gomp_sem_init (&thr->release, 0);
 }
+
+static void __attribute__((destructor))
+team_destructor (void)
+{
+  /* Without this dlclose on libgomp could lead to subsequent
+     crashes.
*/
+  pthread_key_delete (gomp_thread_destructor);
+}
+
+struct gomp_task_icv *
+gomp_new_icv (void)
+{
+  struct gomp_thread *thr = gomp_thread ();
+  struct gomp_task *task = gomp_malloc (sizeof (struct gomp_task));
+  gomp_init_task (task, NULL, &gomp_global_icv);
+  thr->task = task;
+  pthread_setspecific (gomp_thread_destructor, thr);
+  return &task->icv;
+}
diff --git a/libgomp/testsuite/Makefile.in b/libgomp/testsuite/Makefile.in
index 9c6163ba2bf..ae1806fb2da 100644
--- a/libgomp/testsuite/Makefile.in
+++ b/libgomp/testsuite/Makefile.in
@@ -112,9 +112,15 @@ MAINTAINER_MODE_TRUE = @MAINTAINER_MODE_TRUE@
 MAKEINFO = @MAKEINFO@
 NM = @NM@
 OBJEXT = @OBJEXT@
+OMP_LOCK_25_ALIGN = @OMP_LOCK_25_ALIGN@
+OMP_LOCK_25_KIND = @OMP_LOCK_25_KIND@
+OMP_LOCK_25_SIZE = @OMP_LOCK_25_SIZE@
 OMP_LOCK_ALIGN = @OMP_LOCK_ALIGN@
 OMP_LOCK_KIND = @OMP_LOCK_KIND@
 OMP_LOCK_SIZE = @OMP_LOCK_SIZE@
+OMP_NEST_LOCK_25_ALIGN = @OMP_NEST_LOCK_25_ALIGN@
+OMP_NEST_LOCK_25_KIND = @OMP_NEST_LOCK_25_KIND@
+OMP_NEST_LOCK_25_SIZE = @OMP_NEST_LOCK_25_SIZE@
 OMP_NEST_LOCK_ALIGN = @OMP_NEST_LOCK_ALIGN@
 OMP_NEST_LOCK_KIND = @OMP_NEST_LOCK_KIND@
 OMP_NEST_LOCK_SIZE = @OMP_NEST_LOCK_SIZE@
diff --git a/libgomp/testsuite/libgomp.c++/c++.exp b/libgomp/testsuite/libgomp.c++/c++.exp
index f11482c7315..f3f42de6619 100644
--- a/libgomp/testsuite/libgomp.c++/c++.exp
+++ b/libgomp/testsuite/libgomp.c++/c++.exp
@@ -31,8 +31,15 @@ if { $lang_test_file_found } {
     set ld_library_path "$always_ld_library_path:${blddir}/${lang_library_path}"
     set_ld_library_path_env_vars

+    set flags_file "${blddir}/../libstdc++-v3/scripts/testsuite_flags"
+    if { [file exists $flags_file] } {
+        set libstdcxx_includes [exec sh $flags_file --build-includes]
+    } else {
+        set libstdcxx_includes ""
+    }
+
     # Main loop.
-    gfortran-dg-runtest $tests ""
+    gfortran-dg-runtest $tests $libstdcxx_includes
 }

 # All done.
diff --git a/libgomp/testsuite/libgomp.c++/collapse-1.C b/libgomp/testsuite/libgomp.c++/collapse-1.C
new file mode 100644
index 00000000000..132d35cf41d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c++/collapse-1.C
@@ -0,0 +1,29 @@
+// { dg-do run }
+
+#include <string.h>
+#include <stdlib.h>
+
+int
+main ()
+{
+  int i, j, k, l = 0;
+  int a[3][3][3];
+
+  memset (a, '\0', sizeof (a));
+  #pragma omp parallel for collapse(4 - 1) schedule(static, 4)
+    for (i = 0; i < 2; i++)
+      for (j = 0; j < 2; j++)
+        for (k = 0; k < 2; k++)
+          a[i][j][k] = i + j * 4 + k * 16;
+  #pragma omp parallel
+    {
+      #pragma omp for collapse(2) reduction(|:l) private (k)
+        for (i = 0; i < 2; i++)
+          for (j = 0; j < 2; j++)
+            for (k = 0; k < 2; k++)
+              if (a[i][j][k] != i + j * 4 + k * 16)
+                l = 1;
+    }
+  if (l)
+    abort ();
+}
diff --git a/libgomp/testsuite/libgomp.c++/collapse-2.C b/libgomp/testsuite/libgomp.c++/collapse-2.C
new file mode 100644
index 00000000000..a42a1f07ffd
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c++/collapse-2.C
@@ -0,0 +1,371 @@
+// { dg-do run }
+
+#include <omp.h>
+typedef __PTRDIFF_TYPE__ ptrdiff_t;
+extern "C" void abort ();
+
+template <typename T>
+class I
+{
+public:
+  typedef ptrdiff_t difference_type;
+  I ();
+  ~I ();
+  I (T *);
+  I (const I &);
+  T &operator * ();
+  T *operator -> ();
+  T &operator [] (const difference_type &) const;
+  I &operator = (const I &);
+  I &operator ++ ();
+  I operator ++ (int);
+  I &operator -- ();
+  I operator -- (int);
+  I &operator += (const difference_type &);
+  I &operator -= (const difference_type &);
+  I operator + (const difference_type &) const;
+  I operator - (const difference_type &) const;
+  template <typename S> friend bool operator == (I<S> &, I<S> &);
+  template <typename S> friend bool operator == (const I<S> &, const I<S> &);
+  template <typename S> friend bool operator < (I<S> &, I<S> &);
+  template <typename S> friend bool operator < (const I<S> &, const I<S> &);
+  template <typename S> friend bool operator <= (I<S>
&, I<S> &); + template <typename S> friend bool operator <= (const I<S> &, const I<S> &); + template <typename S> friend bool operator > (I<S> &, I<S> &); + template <typename S> friend bool operator > (const I<S> &, const I<S> &); + template <typename S> friend bool operator >= (I<S> &, I<S> &); + template <typename S> friend bool operator >= (const I<S> &, const I<S> &); + template <typename S> friend typename I<S>::difference_type operator - (I<S> &, I<S> &); + template <typename S> friend typename I<S>::difference_type operator - (const I<S> &, const I<S> &); + template <typename S> friend I<S> operator + (typename I<S>::difference_type , const I<S> &); +private: + T *p; +}; +template <typename T> I<T>::I () : p (0) {} +template <typename T> I<T>::~I () { p = (T *) 0; } +template <typename T> I<T>::I (T *x) : p (x) {} +template <typename T> I<T>::I (const I &x) : p (x.p) {} +template <typename T> T &I<T>::operator * () { return *p; } +template <typename T> T *I<T>::operator -> () { return p; } +template <typename T> T &I<T>::operator [] (const difference_type &x) const { return p[x]; } +template <typename T> I<T> &I<T>::operator = (const I &x) { p = x.p; return *this; } +template <typename T> I<T> &I<T>::operator ++ () { ++p; return *this; } +template <typename T> I<T> I<T>::operator ++ (int) { return I (p++); } +template <typename T> I<T> &I<T>::operator -- () { --p; return *this; } +template <typename T> I<T> I<T>::operator -- (int) { return I (p--); } +template <typename T> I<T> &I<T>::operator += (const difference_type &x) { p += x; return *this; } +template <typename T> I<T> &I<T>::operator -= (const difference_type &x) { p -= x; return *this; } +template <typename T> I<T> I<T>::operator + (const difference_type &x) const { return I (p + x); } +template <typename T> I<T> I<T>::operator - (const difference_type &x) const { return I (p - x); } +template <typename T> bool operator == (I<T> &x, I<T> &y) { return x.p == y.p; } +template <typename T> bool 
operator == (const I<T> &x, const I<T> &y) { return x.p == y.p; } +template <typename T> bool operator != (I<T> &x, I<T> &y) { return !(x == y); } +template <typename T> bool operator != (const I<T> &x, const I<T> &y) { return !(x == y); } +template <typename T> bool operator < (I<T> &x, I<T> &y) { return x.p < y.p; } +template <typename T> bool operator < (const I<T> &x, const I<T> &y) { return x.p < y.p; } +template <typename T> bool operator <= (I<T> &x, I<T> &y) { return x.p <= y.p; } +template <typename T> bool operator <= (const I<T> &x, const I<T> &y) { return x.p <= y.p; } +template <typename T> bool operator > (I<T> &x, I<T> &y) { return x.p > y.p; } +template <typename T> bool operator > (const I<T> &x, const I<T> &y) { return x.p > y.p; } +template <typename T> bool operator >= (I<T> &x, I<T> &y) { return x.p >= y.p; } +template <typename T> bool operator >= (const I<T> &x, const I<T> &y) { return x.p >= y.p; } +template <typename T> typename I<T>::difference_type operator - (I<T> &x, I<T> &y) { return x.p - y.p; } +template <typename T> typename I<T>::difference_type operator - (const I<T> &x, const I<T> &y) { return x.p - y.p; } +template <typename T> I<T> operator + (typename I<T>::difference_type x, const I<T> &y) { return I<T> (x + y.p); } + +template <typename T> +class J +{ +public: + J(const I<T> &x, const I<T> &y) : b (x), e (y) {} + const I<T> &begin (); + const I<T> &end (); +private: + I<T> b, e; +}; + +template <typename T> const I<T> &J<T>::begin () { return b; } +template <typename T> const I<T> &J<T>::end () { return e; } + +int results[2000]; + +void +f1 (J<int> x, J<int> y, J<int> z) +{ + I<int> i, j, k; + int l, f = 0, n = 0, m = 0; +#pragma omp parallel shared (i, j, k, l) firstprivate (f) \ + reduction (+:n, m) num_threads (8) + { + #pragma omp for lastprivate (i, j, k, l) schedule (static, 9) \ + collapse (4) + for (i = x.begin (); i < x.end (); ++i) + for (j = y.begin (); j <= y.end (); j += 1) + for (l = 0; l < 1; l++) + for (k = 
z.begin () + 3; k < z.end () - 3; k++) + if (omp_get_num_threads () == 8 + && ((*i + 2) * 12 + (*j + 5) * 4 + (*k - 13) + != (omp_get_thread_num () * 9 + f++))) + n++; + else + m++; + } + if (n || i != x.end () || j != y.end () + 1 || k != z.end () - 3 + || m != 72 || l != 1) + abort (); +} + +void +f2 (J<int> x, J<int> y, J<int> z) +{ + int f = 0, n = 0, m = 0; +#pragma omp parallel for firstprivate (f) reduction (+:n, m) \ + num_threads (8) schedule (static, 9) \ + collapse (6 - 2) + for (I<int> i = x.end () - 1; i >= x.begin (); --i) + for (int l = -131; l >= -131; l--) + for (I<int> j = y.end (); j > y.begin () - 1; j -= 1) + { + for (I<int> k = z.end () - 4; k >= z.begin () + 3; k--) + if (omp_get_num_threads () == 8 + && ((3 - *i) * 12 + (-3 - *j) * 4 + (16 - *k) + != (omp_get_thread_num () * 9 + f++))) + n++; + else + m++; + } + if (n || m != 72) + abort (); +} + +template <typename T> +void +f3 (J<int> x, J<int> y, J<int> z) +{ + I<int> i, j, k; + int l, f = 0, n = 0, m = 0; +#pragma omp parallel shared (i, j, k, l) firstprivate (f) \ + reduction (+:n, m) num_threads (8) + { + #pragma omp for lastprivate (i, j, k, l) schedule (static, 9) \ + collapse (4) + for (i = x.begin (); i < x.end (); ++i) + for (j = y.begin (); j <= y.end (); j += 1) + for (k = z.begin () + 3; k < z.end () - 3; k++) + for (l = 7; l <= 7; l++) + if (omp_get_num_threads () == 8 + && ((*i + 2) * 12 + (*j + 5) * 4 + (*k - 13) + != (omp_get_thread_num () * 9 + f++))) + n++; + else + m++; + } + if (n || i != x.end () || j != y.end () + 1 || k != z.end () - 3 + || m != 72 || l != 8) + abort (); +} + +template <typename T> +void +f4 (J<int> x, J<int> y, J<int> z) +{ + int f = 0, n = 0, m = 0; +#pragma omp parallel for firstprivate (f) reduction (+:n, m) \ + num_threads (8) schedule (static, 9) \ + collapse (5 - 2) + for (I<int> i = x.end () - 1; i >= x.begin (); --i) + { + for (I<int> j = y.end (); j > y.begin () - 1; j -= 1) + { + for (I<int> k = z.end () - 4; k >= z.begin () + 3; k--) + if 
(omp_get_num_threads () == 8 + && ((3 - *i) * 12 + (-3 - *j) * 4 + (16 - *k) + != (omp_get_thread_num () * 9 + f++))) + n++; + else + m++; + } + } + if (n || m != 72) + abort (); +} + +template <typename T> +void +f5 (J<int> x, J<int> y, J<int> z) +{ + I<int> i, j, k; + int f = 0, n = 0, m = 0; +#pragma omp parallel shared (i, j, k) firstprivate (f) \ + reduction (+:n, m) num_threads (8) + { + #pragma omp for lastprivate (i, j, k) schedule (static, 9) \ + collapse (3) + for (i = x.begin (); i < x.end (); ++i) + for (j = y.begin (); j <= y.end (); j += (T) 1) + { + for (k = z.begin () + 3; k < z.end () - 3; k++) + if (omp_get_num_threads () == 8 + && ((*i + 2) * 12 + (*j + 5) * 4 + (*k - 13) + != (omp_get_thread_num () * 9 + f++))) + n++; + else + m++; + } + } + if (n || i != x.end () || j != y.end () + 1 || k != z.end () - 3 + || m != 72) + abort (); +} + +template <typename T> +void +f6 (J<int> x, J<int> y, J<int> z) +{ + int f = 0, n = 0, m = 0; +#pragma omp parallel for firstprivate (f) reduction (+:n, m) \ + num_threads (8) schedule (static, 9) \ + collapse (5 - 2) + for (I<int> i = x.end () - 1; i >= x.begin (); --i) + { + for (I<int> j = y.end (); j > y.begin () - 1; j -= 1) + { + for (I<int> k = z.end () - 4; k >= z.begin () + (T) 3; k--) + if (omp_get_num_threads () == 8 + && ((3 - *i) * 12 + (-3 - *j) * 4 + (16 - *k) + != (omp_get_thread_num () * 9 + f++))) + n++; + else + m++; + } + } + if (n || m != 72) + abort (); +} + +template <typename T> +void +f7 (J<T> x, J<T> y, J<T> z) +{ + I<T> i, j, k, o = y.begin (); + T l, f = 0, n = 0, m = 0; +#pragma omp parallel shared (i, j, k, l) firstprivate (f) \ + reduction (+:n, m) num_threads (8) + { + #pragma omp for lastprivate (i, j, k, l) schedule (static, 9) \ + collapse (4) + for (i = x.begin (); i < x.end (); ++i) + for (j = y.begin (); j <= y.end (); j += 1) + for (l = *o; l <= *o; l = 1 + l) + for (k = z.begin () + 3; k < z.end () - 3; k++) + if (omp_get_num_threads () == 8 + && ((*i + 2) * 12 + (*j + 5) * 
4 + (*k - 13) + != (omp_get_thread_num () * 9 + f++))) + n++; + else + m++; + } + if (n || i != x.end () || j != y.end () + 1 || k != z.end () - 3 + || m != 72 || l != *o + 1) + abort (); +} + +template <typename T> +void +f8 (J<T> x, J<T> y, J<T> z) +{ + T f = 0, n = 0, m = 0; +#pragma omp parallel for firstprivate (f) reduction (+:n, m) \ + num_threads (8) schedule (static, 9) \ + collapse (6 - 2) + for (I<T> i = x.end () - 1; i >= x.begin (); --i) + for (T l = 0; l < 1; l++) + for (I<T> j = y.end (); j > y.begin () - 1; j -= 1) + { + for (I<T> k = z.end () - 4; k >= z.begin () + 3; k--) + if (omp_get_num_threads () == 8 + && ((3 - *i) * 12 + (-3 - *j) * 4 + (16 - *k) + != (omp_get_thread_num () * 9 + f++))) + n++; + else + m++; + } + if (n || m != 72) + abort (); +} + +template <typename S, typename T> +void +f9 (J<T> x, J<T> y, J<T> z) +{ + S i, j, k, o = y.begin (); + T l, f = 0, n = 0, m = 0; +#pragma omp parallel shared (i, j, k, l) firstprivate (f) \ + reduction (+:n, m) num_threads (8) + { + #pragma omp for lastprivate (i, j, k, l) schedule (static, 9) \ + collapse (4) + for (i = x.begin (); i < x.end (); ++i) + for (j = y.begin (); j <= y.end (); j += 1) + for (l = *o; l <= *o; l = 1 + l) + for (k = z.begin () + 3; k < z.end () - 3; k++) + if (omp_get_num_threads () == 8 + && ((*i + 2) * 12 + (*j + 5) * 4 + (*k - 13) + != (omp_get_thread_num () * 9 + f++))) + n++; + else + m++; + } + if (n || i != x.end () || j != y.end () + 1 || k != z.end () - 3 + || m != 72 || l != *o + 1) + abort (); +} + +template <typename S, typename T> +void +f10 (J<T> x, J<T> y, J<T> z) +{ + T f = 0, n = 0, m = 0; +#pragma omp parallel for firstprivate (f) reduction (+:n, m) \ + num_threads (8) schedule (static, 9) \ + collapse (6 - 2) + for (S i = x.end () - 1; i >= x.begin (); --i) + for (T l = 0; l < 1; l++) + for (S j = y.end (); j > y.begin () - 1; j -= 1) + { + for (S k = z.end () - 4; k >= z.begin () + 3; k--) + if (omp_get_num_threads () == 8 + && ((3 - *i) * 12 + (-3 - 
*j) * 4 + (16 - *k) + != (omp_get_thread_num () * 9 + f++))) + n++; + else + m++; + } + if (n || m != 72) + abort (); +} + +int +main () +{ + int a[2000]; + long b[2000]; + for (int i = 0; i < 2000; i++) + { + a[i] = i - 1000; + b[i] = i - 1000; + } + J<int> x (&a[998], &a[1004]); + J<int> y (&a[995], &a[997]); + J<int> z (&a[1010], &a[1020]); + f1 (x, y, z); + f2 (x, y, z); + f3 <int> (x, y, z); + f4 <int> (x, y, z); + f5 <int> (x, y, z); + f6 <int> (x, y, z); + f7 <int> (x, y, z); + f8 <int> (x, y, z); + f9 <I<int>, int> (x, y, z); + f10 <I<int>, int> (x, y, z); +} diff --git a/libgomp/testsuite/libgomp.c++/ctor-10.C b/libgomp/testsuite/libgomp.c++/ctor-10.C new file mode 100644 index 00000000000..f46e45ec418 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/ctor-10.C @@ -0,0 +1,78 @@ +// { dg-do run } +// { dg-require-effective-target tls_runtime } + +#include <omp.h> +#include <assert.h> + +#define N 10 +#define THR 4 + +struct B +{ + B(); + B(const B &); + ~B(); + B& operator=(const B &); + void doit(); + static B *base; + static B *threadbase; +#pragma omp threadprivate(threadbase) +}; + +B *B::base; +B *B::threadbase; +static unsigned cmask[THR]; +static unsigned dmask[THR]; + +B::B() +{ + assert (base == 0); +} + +B::B(const B &b) +{ + unsigned index = &b - base; + assert (index < N); + cmask[omp_get_thread_num()] |= 1u << index; +} + +B::~B() +{ + if (threadbase) + { + unsigned index = this - threadbase; + assert (index < N); + dmask[omp_get_thread_num()] |= 1u << index; + } +} + +void foo() +{ + B b[N]; + + B::base = b; + + #pragma omp parallel firstprivate(b) + { + assert (omp_get_num_threads () == THR); + B::threadbase = b; + } + + B::threadbase = 0; +} + +int main() +{ + omp_set_dynamic (0); + omp_set_num_threads (THR); + foo(); + + for (int i = 0; i < THR; ++i) + { + unsigned xmask = (1u << N) - 1; + assert (cmask[i] == xmask); + assert (dmask[i] == xmask); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/for-1.C 
b/libgomp/testsuite/libgomp.c++/for-1.C new file mode 100644 index 00000000000..1c713464ebe --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/for-1.C @@ -0,0 +1,291 @@ +// { dg-do run } + +typedef __PTRDIFF_TYPE__ ptrdiff_t; +extern "C" void abort (); + +template <typename T> +class I +{ +public: + typedef ptrdiff_t difference_type; + I (); + ~I (); + I (T *); + I (const I &); + T &operator * (); + T *operator -> (); + T &operator [] (const difference_type &) const; + I &operator = (const I &); + I &operator ++ (); + I operator ++ (int); + I &operator -- (); + I operator -- (int); + I &operator += (const difference_type &); + I &operator -= (const difference_type &); + I operator + (const difference_type &) const; + I operator - (const difference_type &) const; + template <typename S> friend bool operator == (I<S> &, I<S> &); + template <typename S> friend bool operator == (const I<S> &, const I<S> &); + template <typename S> friend bool operator < (I<S> &, I<S> &); + template <typename S> friend bool operator < (const I<S> &, const I<S> &); + template <typename S> friend bool operator <= (I<S> &, I<S> &); + template <typename S> friend bool operator <= (const I<S> &, const I<S> &); + template <typename S> friend bool operator > (I<S> &, I<S> &); + template <typename S> friend bool operator > (const I<S> &, const I<S> &); + template <typename S> friend bool operator >= (I<S> &, I<S> &); + template <typename S> friend bool operator >= (const I<S> &, const I<S> &); + template <typename S> friend typename I<S>::difference_type operator - (I<S> &, I<S> &); + template <typename S> friend typename I<S>::difference_type operator - (const I<S> &, const I<S> &); + template <typename S> friend I<S> operator + (typename I<S>::difference_type , const I<S> &); +private: + T *p; +}; +template <typename T> I<T>::I () : p (0) {} +template <typename T> I<T>::~I () {} +template <typename T> I<T>::I (T *x) : p (x) {} +template <typename T> I<T>::I (const I &x) : p (x.p) {} +template 
<typename T> T &I<T>::operator * () { return *p; } +template <typename T> T *I<T>::operator -> () { return p; } +template <typename T> T &I<T>::operator [] (const difference_type &x) const { return p[x]; } +template <typename T> I<T> &I<T>::operator = (const I &x) { p = x.p; return *this; } +template <typename T> I<T> &I<T>::operator ++ () { ++p; return *this; } +template <typename T> I<T> I<T>::operator ++ (int) { return I (p++); } +template <typename T> I<T> &I<T>::operator -- () { --p; return *this; } +template <typename T> I<T> I<T>::operator -- (int) { return I (p--); } +template <typename T> I<T> &I<T>::operator += (const difference_type &x) { p += x; return *this; } +template <typename T> I<T> &I<T>::operator -= (const difference_type &x) { p -= x; return *this; } +template <typename T> I<T> I<T>::operator + (const difference_type &x) const { return I (p + x); } +template <typename T> I<T> I<T>::operator - (const difference_type &x) const { return I (p - x); } +template <typename T> bool operator == (I<T> &x, I<T> &y) { return x.p == y.p; } +template <typename T> bool operator == (const I<T> &x, const I<T> &y) { return x.p == y.p; } +template <typename T> bool operator != (I<T> &x, I<T> &y) { return !(x == y); } +template <typename T> bool operator != (const I<T> &x, const I<T> &y) { return !(x == y); } +template <typename T> bool operator < (I<T> &x, I<T> &y) { return x.p < y.p; } +template <typename T> bool operator < (const I<T> &x, const I<T> &y) { return x.p < y.p; } +template <typename T> bool operator <= (I<T> &x, I<T> &y) { return x.p <= y.p; } +template <typename T> bool operator <= (const I<T> &x, const I<T> &y) { return x.p <= y.p; } +template <typename T> bool operator > (I<T> &x, I<T> &y) { return x.p > y.p; } +template <typename T> bool operator > (const I<T> &x, const I<T> &y) { return x.p > y.p; } +template <typename T> bool operator >= (I<T> &x, I<T> &y) { return x.p >= y.p; } +template <typename T> bool operator >= (const I<T> &x, const 
I<T> &y) { return x.p >= y.p; } +template <typename T> typename I<T>::difference_type operator - (I<T> &x, I<T> &y) { return x.p - y.p; } +template <typename T> typename I<T>::difference_type operator - (const I<T> &x, const I<T> &y) { return x.p - y.p; } +template <typename T> I<T> operator + (typename I<T>::difference_type x, const I<T> &y) { return I<T> (x + y.p); } + +template <typename T> +class J +{ +public: + J(const I<T> &x, const I<T> &y) : b (x), e (y) {} + const I<T> &begin (); + const I<T> &end (); +private: + I<T> b, e; +}; + +template <typename T> const I<T> &J<T>::begin () { return b; } +template <typename T> const I<T> &J<T>::end () { return e; } + +int results[2000]; + +template <typename T> +void +baz (I<T> &i) +{ + if (*i < 0 || *i >= 2000) + abort (); + results[*i]++; +} + +void +f1 (const I<int> &x, const I<int> &y) +{ +#pragma omp parallel for + for (I<int> i = x; i <= y; i += 6) + baz (i); +} + +void +f2 (const I<int> &x, const I<int> &y) +{ + I<int> i; +#pragma omp parallel for private(i) + for (i = x; i < y - 1; i = 1 - 6 + 7 + i) + baz (i); +} + +template <typename T> +void +f3 (const I<int> &x, const I<int> &y) +{ +#pragma omp parallel for + for (I<int> i = x; i <= y; i = i + 9 - 8) + baz (i); +} + +template <typename T> +void +f4 (const I<int> &x, const I<int> &y) +{ + I<int> i; +#pragma omp parallel for lastprivate(i) + for (i = x + 2000 - 64; i > y + 10; --i) + baz (i); +} + +void +f5 (const I<int> &x, const I<int> &y) +{ +#pragma omp parallel for + for (I<int> i = x + 2000 - 64; i > y + 10; i -= 10) + baz (i); +} + +template <int N> +void +f6 (const I<int> &x, const I<int> &y) +{ +#pragma omp parallel for + for (I<int> i = x + 2000 - 64; i > y + 10; i = i - 12 + 2) + { + I<int> j = i + N; + baz (j); + } +} + +template <int N> +void +f7 (I<int> i, const I<int> &x, const I<int> &y) +{ +#pragma omp parallel for + for (i = x - 10; i <= y + 10; i += N) + baz (i); +} + +template <int N> +void +f8 (J<int> j) +{ + I<int> i; +#pragma omp 
parallel for + for (i = j.begin (); i <= j.end () + N; i += 2) + baz (i); +} + +template <typename T, int N> +void +f9 (const I<T> &x, const I<T> &y) +{ +#pragma omp parallel for + for (I<T> i = x; i <= y; i = i + N) + baz (i); +} + +template <typename T, int N> +void +f10 (const I<T> &x, const I<T> &y) +{ + I<T> i; +#pragma omp parallel for + for (i = x; i > y; i = i + N) + baz (i); +} + +template <typename T> +void +f11 (const T &x, const T &y) +{ +#pragma omp parallel + { +#pragma omp for nowait + for (T i = x; i <= y; i += 3) + baz (i); +#pragma omp single + { + T j = y + 3; + baz (j); + } + } +} + +template <typename T> +void +f12 (const T &x, const T &y) +{ + T i; +#pragma omp parallel for + for (i = x; i > y; --i) + baz (i); +} + +template <int N> +struct K +{ + template <typename T> + static void + f13 (const T &x, const T &y) + { +#pragma omp parallel for + for (T i = x; i <= y + N; i += N) + baz (i); + } +}; + +#define check(expr) \ + for (int i = 0; i < 2000; i++) \ + if (expr) \ + { \ + if (results[i] != 1) \ + abort (); \ + results[i] = 0; \ + } \ + else if (results[i]) \ + abort () + +int +main () +{ + int a[2000]; + long b[2000]; + for (int i = 0; i < 2000; i++) + { + a[i] = i; + b[i] = i; + } + f1 (&a[10], &a[1990]); + check (i >= 10 && i <= 1990 && (i - 10) % 6 == 0); + f2 (&a[0], &a[1999]); + check (i < 1998 && (i & 1) == 0); + f3<char> (&a[20], &a[1837]); + check (i >= 20 && i <= 1837); + f4<int> (&a[0], &a[30]); + check (i > 40 && i <= 2000 - 64); + f5 (&a[0], &a[100]); + check (i >= 116 && i <= 2000 - 64 && (i - 116) % 10 == 0); + f6<-10> (&a[10], &a[110]); + check (i >= 116 && i <= 2000 - 64 && (i - 116) % 10 == 0); + f7<6> (I<int> (), &a[12], &a[1800]); + check (i >= 2 && i <= 1808 && (i - 2) % 6 == 0); + f8<121> (J<int> (&a[14], &a[1803])); + check (i >= 14 && i <= 1924 && (i & 1) == 0); + f9<int, 7> (&a[33], &a[1967]); + check (i >= 33 && i <= 1967 && (i - 33) % 7 == 0); + f10<int, -7> (&a[1939], &a[17]); + check (i >= 21 && i <= 1939 && (i 
- 21) % 7 == 0); + f11<I<int> > (&a[16], &a[1981]); + check (i >= 16 && i <= 1984 && (i - 16) % 3 == 0); + f12<I<int> > (&a[1761], &a[37]); + check (i > 37 && i <= 1761); + K<5>::f13<I<int> > (&a[1], &a[1935]); + check (i >= 1 && i <= 1936 && (i - 1) % 5 == 0); + f9<long, 7> (&b[33], &b[1967]); + check (i >= 33 && i <= 1967 && (i - 33) % 7 == 0); + f10<long, -7> (&b[1939], &b[17]); + check (i >= 21 && i <= 1939 && (i - 21) % 7 == 0); + f11<I<long> > (&b[16], &b[1981]); + check (i >= 16 && i <= 1984 && (i - 16) % 3 == 0); + f12<I<long> > (&b[1761], &b[37]); + check (i > 37 && i <= 1761); + K<5>::f13<I<long> > (&b[1], &b[1935]); + check (i >= 1 && i <= 1936 && (i - 1) % 5 == 0); +} diff --git a/libgomp/testsuite/libgomp.c++/for-2.C b/libgomp/testsuite/libgomp.c++/for-2.C new file mode 100644 index 00000000000..98ffa1ae6f0 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/for-2.C @@ -0,0 +1,182 @@ +// { dg-do run } + +extern "C" void abort (); + +template <typename T> +class J +{ +public: + J(T x, T y) : b (x), e (y) {} + T begin (); + T end (); +private: + T b, e; +}; + +template <typename T> T J<T>::begin () { return b; } +template <typename T> T J<T>::end () { return e; } + +int results[2000]; + +void +baz (int i) +{ + if (i < 0 || i >= 2000) + abort (); + results[i]++; +} + +void +f1 (int x, int y) +{ +#pragma omp parallel for + for (int i = x; i <= y; i += 6) + baz (i); +} + +void +f2 (int x, int y) +{ + int i; +#pragma omp parallel for private(i) + for (i = x; i < y - 1; i = 1 - 6 + 7 + i) + baz (i); +} + +template <typename T> +void +f3 (int x, int y) +{ +#pragma omp parallel for + for (int i = x; i <= y; i = i + 9 - 8) + baz (i); +} + +template <typename T> +void +f4 (int x, int y) +{ + int i; +#pragma omp parallel for lastprivate(i) + for (i = x + 2000 - 64; i > y + 10; --i) + baz (i); +} + +void +f5 (int x, int y) +{ +#pragma omp parallel for + for (int i = x + 2000 - 64; i > y + 10L; i -= 10L) + baz (i); +} + +template <int N> +void +f6 (int x, int y) +{ 
+#pragma omp parallel for + for (int i = x + 2000 - 64; i > y + 10L; i = i - 12 + 2L) + baz (i + N); +} + +template <long N> +void +f7 (int i, int x, int y) +{ +#pragma omp parallel for + for (i = x - 10; i <= y + 10; i += N) + baz (i); +} + +template <long N> +void +f8 (J<int> j) +{ + int i; +#pragma omp parallel for + for (i = j.begin (); i <= j.end () + N; i += 2) + baz (i); +} + +template <typename T, long N> +void +f9 (T x, T y) +{ +#pragma omp parallel for + for (T i = x; i <= y; i = i + N) + baz (i); +} + +template <typename T, long N> +void +f10 (T x, T y) +{ + T i; +#pragma omp parallel for + for (i = x; i > y; i = i + N) + baz (i); +} + +template <typename T> +void +f11 (T x, long y) +{ +#pragma omp parallel + { +#pragma omp for nowait + for (T i = x; i <= y; i += 3L) + baz (i); +#pragma omp single + baz (y + 3); + } +} + +template <typename T> +void +f12 (T x, T y) +{ + T i; +#pragma omp parallel for + for (i = x; i > y; --i) + baz (i); +} + +#define check(expr) \ + for (int i = 0; i < 2000; i++) \ + if (expr) \ + { \ + if (results[i] != 1) \ + abort (); \ + results[i] = 0; \ + } \ + else if (results[i]) \ + abort () + +int +main () +{ + f1 (10, 1990); + check (i >= 10 && i <= 1990 && (i - 10) % 6 == 0); + f2 (0, 1999); + check (i < 1998 && (i & 1) == 0); + f3<char> (20, 1837); + check (i >= 20 && i <= 1837); + f4<int> (0, 30); + check (i > 40 && i <= 2000 - 64); + f5 (0, 100); + check (i >= 116 && i <= 2000 - 64 && (i - 116) % 10 == 0); + f6<-10> (10, 110); + check (i >= 116 && i <= 2000 - 64 && (i - 116) % 10 == 0); + f7<6> (0, 12, 1800); + check (i >= 2 && i <= 1808 && (i - 2) % 6 == 0); + f8<121> (J<int> (14, 1803)); + check (i >= 14 && i <= 1924 && (i & 1) == 0); + f9<int, 7> (33, 1967); + check (i >= 33 && i <= 1967 && (i - 33) % 7 == 0); + f10<int, -7> (1939, 17); + check (i >= 21 && i <= 1939 && (i - 21) % 7 == 0); + f11<int> (16, 1981); + check (i >= 16 && i <= 1984 && (i - 16) % 3 == 0); + f12<int> (1761, 37); + check (i > 37 && i <= 1761); +} 
diff --git a/libgomp/testsuite/libgomp.c++/for-3.C b/libgomp/testsuite/libgomp.c++/for-3.C new file mode 100644 index 00000000000..235f83875ea --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/for-3.C @@ -0,0 +1,239 @@ +// { dg-do run } + +#include <vector> +#include <cstdlib> + +template <typename T> +class J +{ +public: + typedef typename std::vector<T>::const_iterator const_iterator; + J(const const_iterator &x, const const_iterator &y) : b (x), e (y) {} + const const_iterator &begin (); + const const_iterator &end (); +private: + const_iterator b, e; +}; + +template <typename T> +const typename std::vector<T>::const_iterator &J<T>::begin () { return b; } +template <typename T> +const typename std::vector<T>::const_iterator &J<T>::end () { return e; } + +int results[2000]; + +template <typename T> +void +baz (T &i) +{ + if (*i < 0 || *i >= 2000) + std::abort (); + results[*i]++; +} + +void +f1 (const std::vector<int>::const_iterator &x, + const std::vector<int>::const_iterator &y) +{ +#pragma omp parallel for + for (std::vector<int>::const_iterator i = x; i <= y; i += 6) + baz (i); +} + +void +f2 (const std::vector<int>::const_iterator &x, + const std::vector<int>::const_iterator &y) +{ + std::vector<int>::const_iterator i; +#pragma omp parallel for private(i) + for (i = x; i < y - 1; i = 1 - 6 + 7 + i) + baz (i); +} + +template <typename T> +void +f3 (const std::vector<int>::const_iterator &x, + const std::vector<int>::const_iterator &y) +{ +#pragma omp parallel for schedule (dynamic, 6) + for (std::vector<int>::const_iterator i = x; i <= y; i = i + 9 - 8) + baz (i); +} + +template <typename T> +void +f4 (const std::vector<int>::const_iterator &x, + const std::vector<int>::const_iterator &y) +{ + std::vector<int>::const_iterator i; +#pragma omp parallel for lastprivate(i) + for (i = x + 2000 - 64; i > y + 10; --i) + baz (i); +} + +void +f5 (const std::vector<int>::const_iterator &x, + const std::vector<int>::const_iterator &y) +{ +#pragma omp parallel for 
schedule (static, 10) + for (std::vector<int>::const_iterator i = x + 2000 - 64; i > y + 10; i -= 10) + baz (i); +} + +template <int N> +void +f6 (const std::vector<int>::const_iterator &x, + const std::vector<int>::const_iterator &y) +{ +#pragma omp parallel for schedule (runtime) + for (std::vector<int>::const_iterator i = x + 2000 - 64; + i > y + 10; i = i - 12 + 2) + { + std::vector<int>::const_iterator j = i + N; + baz (j); + } +} + +template <int N> +void +f7 (std::vector<int>::const_iterator i, + const std::vector<int>::const_iterator &x, + const std::vector<int>::const_iterator &y) +{ +#pragma omp parallel for schedule (dynamic, 6) + for (i = x - 10; i <= y + 10; i += N) + baz (i); +} + +template <int N> +void +f8 (J<int> j) +{ + std::vector<int>::const_iterator i; +#pragma omp parallel for schedule (dynamic, 40) + for (i = j.begin (); i <= j.end () + N; i += 2) + baz (i); +} + +template <typename T, int N> +void +f9 (const typename std::vector<T>::const_iterator &x, + const typename std::vector<T>::const_iterator &y) +{ +#pragma omp parallel for schedule (static, 25) + for (typename std::vector<T>::const_iterator i = x; i <= y; i = i + N) + baz (i); +} + +template <typename T, int N> +void +f10 (const typename std::vector<T>::const_iterator &x, + const typename std::vector<T>::const_iterator &y) +{ + typename std::vector<T>::const_iterator i; +#pragma omp parallel for + for (i = x; i > y; i = i + N) + baz (i); +} + +template <typename T> +void +f11 (const T &x, const T &y) +{ +#pragma omp parallel + { +#pragma omp for nowait schedule (static, 2) + for (T i = x; i <= y; i += 3) + baz (i); +#pragma omp single + { + T j = y + 3; + baz (j); + } + } +} + +template <typename T> +void +f12 (const T &x, const T &y) +{ + T i; +#pragma omp parallel for schedule (dynamic, 130) + for (i = x; i > y; --i) + baz (i); +} + +template <int N> +struct K +{ + template <typename T> + static void + f13 (const T &x, const T &y) + { +#pragma omp parallel for schedule (runtime) + 
for (T i = x; i <= y + N; i += N) + baz (i); + } +}; + +#define check(expr) \ + for (int i = 0; i < 2000; i++) \ + if (expr) \ + { \ + if (results[i] != 1) \ + std::abort (); \ + results[i] = 0; \ + } \ + else if (results[i]) \ + std::abort () + +int +main () +{ + std::vector<int> a(2000); + std::vector<long> b(2000); + for (int i = 0; i < 2000; i++) + { + a[i] = i; + b[i] = i; + } + f1 (a.begin () + 10, a.begin () + 1990); + check (i >= 10 && i <= 1990 && (i - 10) % 6 == 0); + f2 (a.begin () + 0, a.begin () + 1999); + check (i < 1998 && (i & 1) == 0); + f3<char> (a.begin () + 20, a.begin () + 1837); + check (i >= 20 && i <= 1837); + f4<int> (a.begin () + 0, a.begin () + 30); + check (i > 40 && i <= 2000 - 64); + f5 (a.begin () + 0, a.begin () + 100); + check (i >= 116 && i <= 2000 - 64 && (i - 116) % 10 == 0); + f6<-10> (a.begin () + 10, a.begin () + 110); + check (i >= 116 && i <= 2000 - 64 && (i - 116) % 10 == 0); + f7<6> (std::vector<int>::const_iterator (), a.begin () + 12, + a.begin () + 1800); + check (i >= 2 && i <= 1808 && (i - 2) % 6 == 0); + f8<121> (J<int> (a.begin () + 14, a.begin () + 1803)); + check (i >= 14 && i <= 1924 && (i & 1) == 0); + f9<int, 7> (a.begin () + 33, a.begin () + 1967); + check (i >= 33 && i <= 1967 && (i - 33) % 7 == 0); + f10<int, -7> (a.begin () + 1939, a.begin () + 17); + check (i >= 21 && i <= 1939 && (i - 21) % 7 == 0); + f11<std::vector<int>::const_iterator > (a.begin () + 16, a.begin () + 1981); + check (i >= 16 && i <= 1984 && (i - 16) % 3 == 0); + f12<std::vector<int>::const_iterator > (a.begin () + 1761, a.begin () + 37); + check (i > 37 && i <= 1761); + K<5>::f13<std::vector<int>::const_iterator > (a.begin () + 1, + a.begin () + 1935); + check (i >= 1 && i <= 1936 && (i - 1) % 5 == 0); + f9<long, 7> (b.begin () + 33, b.begin () + 1967); + check (i >= 33 && i <= 1967 && (i - 33) % 7 == 0); + f10<long, -7> (b.begin () + 1939, b.begin () + 17); + check (i >= 21 && i <= 1939 && (i - 21) % 7 == 0); + 
f11<std::vector<long>::const_iterator > (b.begin () + 16, b.begin () + 1981); + check (i >= 16 && i <= 1984 && (i - 16) % 3 == 0); + f12<std::vector<long>::const_iterator > (b.begin () + 1761, b.begin () + 37); + check (i > 37 && i <= 1761); + K<5>::f13<std::vector<long>::const_iterator > (b.begin () + 1, + b.begin () + 1935); + check (i >= 1 && i <= 1936 && (i - 1) % 5 == 0); +} diff --git a/libgomp/testsuite/libgomp.c++/for-4.C b/libgomp/testsuite/libgomp.c++/for-4.C new file mode 100644 index 00000000000..c528ef9d1fa --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/for-4.C @@ -0,0 +1,225 @@ +// { dg-do run } + +#include <string> +#include <cstdlib> + +template <typename T> +class J +{ +public: + typedef typename std::basic_string<T>::iterator iterator; + J(const iterator &x, const iterator &y) : b (x), e (y) {} + const iterator &begin (); + const iterator &end (); +private: + iterator b, e; +}; + +template <typename T> +const typename std::basic_string<T>::iterator &J<T>::begin () { return b; } +template <typename T> +const typename std::basic_string<T>::iterator &J<T>::end () { return e; } + +template <typename T> +void +baz (T &i) +{ + if (*i < L'a' || *i >= L'a' + 2000) + std::abort (); + (*i)++; +} + +void +f1 (const std::basic_string<wchar_t>::iterator &x, + const std::basic_string<wchar_t>::iterator &y) +{ +#pragma omp parallel for + for (std::basic_string<wchar_t>::iterator i = x; i <= y; i += 6) + baz (i); +} + +void +f2 (const std::basic_string<wchar_t>::iterator &x, + const std::basic_string<wchar_t>::iterator &y) +{ + std::basic_string<wchar_t>::iterator i; +#pragma omp parallel for private(i) + for (i = x; i < y - 1; i = 1 - 6 + 7 + i) + baz (i); +} + +template <typename T> +void +f3 (const std::basic_string<wchar_t>::iterator &x, + const std::basic_string<wchar_t>::iterator &y) +{ +#pragma omp parallel for schedule (dynamic, 6) + for (std::basic_string<wchar_t>::iterator i = x; i <= y; i = i + 9 - 8) + baz (i); +} + +template <typename T> +void +f4 
(const std::basic_string<wchar_t>::iterator &x, + const std::basic_string<wchar_t>::iterator &y) +{ + std::basic_string<wchar_t>::iterator i; +#pragma omp parallel for lastprivate(i) + for (i = x + 2000 - 64; i > y + 10; --i) + baz (i); +} + +void +f5 (const std::basic_string<wchar_t>::iterator &x, + const std::basic_string<wchar_t>::iterator &y) +{ +#pragma omp parallel for schedule (static, 10) + for (std::basic_string<wchar_t>::iterator i = x + 2000 - 64; + i > y + 10; i -= 10) + baz (i); +} + +template <int N> +void +f6 (const std::basic_string<wchar_t>::iterator &x, + const std::basic_string<wchar_t>::iterator &y) +{ +#pragma omp parallel for schedule (runtime) + for (std::basic_string<wchar_t>::iterator i = x + 2000 - 64; + i > y + 10; i = i - 12 + 2) + { + std::basic_string<wchar_t>::iterator j = i + N; + baz (j); + } +} + +template <int N> +void +f7 (std::basic_string<wchar_t>::iterator i, + const std::basic_string<wchar_t>::iterator &x, + const std::basic_string<wchar_t>::iterator &y) +{ +#pragma omp parallel for schedule (dynamic, 6) + for (i = x - 10; i <= y + 10; i += N) + baz (i); +} + +template <wchar_t N> +void +f8 (J<wchar_t> j) +{ + std::basic_string<wchar_t>::iterator i; +#pragma omp parallel for schedule (dynamic, 40) + for (i = j.begin (); i <= j.end () + N; i += 2) + baz (i); +} + +template <typename T, int N> +void +f9 (const typename std::basic_string<T>::iterator &x, + const typename std::basic_string<T>::iterator &y) +{ +#pragma omp parallel for schedule (static, 25) + for (typename std::basic_string<T>::iterator i = x; i <= y; i = i + N) + baz (i); +} + +template <typename T, int N> +void +f10 (const typename std::basic_string<T>::iterator &x, + const typename std::basic_string<T>::iterator &y) +{ + typename std::basic_string<T>::iterator i; +#pragma omp parallel for + for (i = x; i > y; i = i + N) + baz (i); +} + +template <typename T> +void +f11 (const T &x, const T &y) +{ +#pragma omp parallel + { +#pragma omp for nowait schedule 
(static, 2) + for (T i = x; i <= y; i += 3) + baz (i); +#pragma omp single + { + T j = y + 3; + baz (j); + } + } +} + +template <typename T> +void +f12 (const T &x, const T &y) +{ + T i; +#pragma omp parallel for schedule (dynamic, 130) + for (i = x; i > y; --i) + baz (i); +} + +template <int N> +struct K +{ + template <typename T> + static void + f13 (const T &x, const T &y) + { +#pragma omp parallel for schedule (runtime) + for (T i = x; i <= y + N; i += N) + baz (i); + } +}; + +#define check(expr) \ + for (int i = 0; i < 2000; i++) \ + if (expr) \ + { \ + if (a[i] != L'a' + i + 1) \ + std::abort (); \ + a[i] = L'a' + i; \ + } \ + else if (a[i] != L'a' + i) \ + std::abort () + +int +main () +{ + std::basic_string<wchar_t> a = L""; + for (int i = 0; i < 2000; i++) + a += L'a' + i; + f1 (a.begin () + 10, a.begin () + 1990); + check (i >= 10 && i <= 1990 && (i - 10) % 6 == 0); + f2 (a.begin () + 0, a.begin () + 1999); + check (i < 1998 && (i & 1) == 0); + f3<char> (a.begin () + 20, a.begin () + 1837); + check (i >= 20 && i <= 1837); + f4<int> (a.begin () + 0, a.begin () + 30); + check (i > 40 && i <= 2000 - 64); + f5 (a.begin () + 0, a.begin () + 100); + check (i >= 116 && i <= 2000 - 64 && (i - 116) % 10 == 0); + f6<-10> (a.begin () + 10, a.begin () + 110); + check (i >= 116 && i <= 2000 - 64 && (i - 116) % 10 == 0); + f7<6> (std::basic_string<wchar_t>::iterator (), a.begin () + 12, + a.begin () + 1800); + check (i >= 2 && i <= 1808 && (i - 2) % 6 == 0); + f8<121> (J<wchar_t> (a.begin () + 14, a.begin () + 1803)); + check (i >= 14 && i <= 1924 && (i & 1) == 0); + f9<wchar_t, 7> (a.begin () + 33, a.begin () + 1967); + check (i >= 33 && i <= 1967 && (i - 33) % 7 == 0); + f10<wchar_t, -7> (a.begin () + 1939, a.begin () + 17); + check (i >= 21 && i <= 1939 && (i - 21) % 7 == 0); + f11<std::basic_string<wchar_t>::iterator > (a.begin () + 16, + a.begin () + 1981); + check (i >= 16 && i <= 1984 && (i - 16) % 3 == 0); + f12<std::basic_string<wchar_t>::iterator > (a.begin 
() + 1761, + a.begin () + 37); + check (i > 37 && i <= 1761); + K<5>::f13<std::basic_string<wchar_t>::iterator > (a.begin () + 1, + a.begin () + 1935); + check (i >= 1 && i <= 1936 && (i - 1) % 5 == 0); +} diff --git a/libgomp/testsuite/libgomp.c++/for-5.C b/libgomp/testsuite/libgomp.c++/for-5.C new file mode 100644 index 00000000000..9b75bf379ce --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/for-5.C @@ -0,0 +1,303 @@ +// { dg-do run } + +typedef __PTRDIFF_TYPE__ ptrdiff_t; +extern "C" void abort (); + +template <typename T> +class I +{ +public: + typedef ptrdiff_t difference_type; + I (); + ~I (); + I (T *); + I (const I &); + T &operator * (); + T *operator -> (); + T &operator [] (const difference_type &) const; + I &operator = (const I &); + I &operator ++ (); + I operator ++ (int); + I &operator -- (); + I operator -- (int); + I &operator += (const difference_type &); + I &operator -= (const difference_type &); + I operator + (const difference_type &) const; + I operator - (const difference_type &) const; + template <typename S> friend bool operator == (I<S> &, I<S> &); + template <typename S> friend bool operator == (const I<S> &, const I<S> &); + template <typename S> friend bool operator < (I<S> &, I<S> &); + template <typename S> friend bool operator < (const I<S> &, const I<S> &); + template <typename S> friend bool operator <= (I<S> &, I<S> &); + template <typename S> friend bool operator <= (const I<S> &, const I<S> &); + template <typename S> friend bool operator > (I<S> &, I<S> &); + template <typename S> friend bool operator > (const I<S> &, const I<S> &); + template <typename S> friend bool operator >= (I<S> &, I<S> &); + template <typename S> friend bool operator >= (const I<S> &, const I<S> &); + template <typename S> friend typename I<S>::difference_type operator - (I<S> &, I<S> &); + template <typename S> friend typename I<S>::difference_type operator - (const I<S> &, const I<S> &); + template <typename S> friend I<S> operator + (typename 
I<S>::difference_type , const I<S> &); +private: + T *p; +}; +template <typename T> I<T>::I () : p (0) {} +template <typename T> I<T>::~I () { p = (T *) 0; } +template <typename T> I<T>::I (T *x) : p (x) {} +template <typename T> I<T>::I (const I &x) : p (x.p) {} +template <typename T> T &I<T>::operator * () { return *p; } +template <typename T> T *I<T>::operator -> () { return p; } +template <typename T> T &I<T>::operator [] (const difference_type &x) const { return p[x]; } +template <typename T> I<T> &I<T>::operator = (const I &x) { p = x.p; return *this; } +template <typename T> I<T> &I<T>::operator ++ () { ++p; return *this; } +template <typename T> I<T> I<T>::operator ++ (int) { return I (p++); } +template <typename T> I<T> &I<T>::operator -- () { --p; return *this; } +template <typename T> I<T> I<T>::operator -- (int) { return I (p--); } +template <typename T> I<T> &I<T>::operator += (const difference_type &x) { p += x; return *this; } +template <typename T> I<T> &I<T>::operator -= (const difference_type &x) { p -= x; return *this; } +template <typename T> I<T> I<T>::operator + (const difference_type &x) const { return I (p + x); } +template <typename T> I<T> I<T>::operator - (const difference_type &x) const { return I (p - x); } +template <typename T> bool operator == (I<T> &x, I<T> &y) { return x.p == y.p; } +template <typename T> bool operator == (const I<T> &x, const I<T> &y) { return x.p == y.p; } +template <typename T> bool operator != (I<T> &x, I<T> &y) { return !(x == y); } +template <typename T> bool operator != (const I<T> &x, const I<T> &y) { return !(x == y); } +template <typename T> bool operator < (I<T> &x, I<T> &y) { return x.p < y.p; } +template <typename T> bool operator < (const I<T> &x, const I<T> &y) { return x.p < y.p; } +template <typename T> bool operator <= (I<T> &x, I<T> &y) { return x.p <= y.p; } +template <typename T> bool operator <= (const I<T> &x, const I<T> &y) { return x.p <= y.p; } +template <typename T> bool operator > (I<T> 
&x, I<T> &y) { return x.p > y.p; } +template <typename T> bool operator > (const I<T> &x, const I<T> &y) { return x.p > y.p; } +template <typename T> bool operator >= (I<T> &x, I<T> &y) { return x.p >= y.p; } +template <typename T> bool operator >= (const I<T> &x, const I<T> &y) { return x.p >= y.p; } +template <typename T> typename I<T>::difference_type operator - (I<T> &x, I<T> &y) { return x.p - y.p; } +template <typename T> typename I<T>::difference_type operator - (const I<T> &x, const I<T> &y) { return x.p - y.p; } +template <typename T> I<T> operator + (typename I<T>::difference_type x, const I<T> &y) { return I<T> (x + y.p); } + +template <typename T> +class J +{ +public: + J(const I<T> &x, const I<T> &y) : b (x), e (y) {} + const I<T> &begin (); + const I<T> &end (); +private: + I<T> b, e; +}; + +template <typename T> const I<T> &J<T>::begin () { return b; } +template <typename T> const I<T> &J<T>::end () { return e; } + +int results[2000]; + +template <typename T> +void +baz (I<T> &i) +{ + if (*i < 0 || *i >= 2000) + abort (); + results[*i]++; +} + +I<int> +f1 (const I<int> &x, const I<int> &y) +{ + I<int> i; +#pragma omp parallel shared (i) + { + #pragma omp for lastprivate (i) schedule(runtime) + for (i = x; i < y - 1; ++i) + baz (i); + #pragma omp single + i += 3; + } + return I<int> (i); +} + +I<int> +f2 (const I<int> &x, const I<int> &y) +{ + I<int> i; +#pragma omp parallel for lastprivate (i) + for (i = x; i < y - 1; i = 1 - 6 + 7 + i) + baz (i); + return I<int> (i); +} + +template <typename T> +I<int> +f3 (const I<int> &x, const I<int> &y) +{ + I<int> i; +#pragma omp parallel + #pragma omp for lastprivate (i) + for (i = x + 1000 - 64; i <= y - 10; i++) + baz (i); + return i; +} + +template <typename T> +I<int> +f4 (const I<int> &x, const I<int> &y) +{ + I<int> i; +#pragma omp parallel for lastprivate (i) + for (i = x + 2000 - 64; i > y + 10; --i) + baz (i); + return I<int> (i); +} + +template <typename T> +I<int> +f5 (const I<int> &x, const I<int> 
&y) +{ + I<int> i; +#pragma omp parallel for lastprivate (i) + for (i = x; i > y + T (6); i--) + baz (i); + return i; +} + +template <typename T> +I<int> +f6 (const I<int> &x, const I<int> &y) +{ + I<int> i; +#pragma omp parallel for lastprivate (i) + for (i = x - T (7); i > y; i -= T (2)) + baz (i); + return I<int> (i); +} + +template <int N> +I<int> +f7 (I<int> i, const I<int> &x, const I<int> &y) +{ +#pragma omp parallel for lastprivate (i) + for (i = x - 10; i <= y + 10; i += N) + baz (i); + return I<int> (i); +} + +template <int N> +I<int> +f8 (J<int> j) +{ + I<int> i; +#pragma omp parallel shared (i) + #pragma omp for lastprivate (i) + for (i = j.begin (); i <= j.end () + N; i += 2) + baz (i); + return i; +} + +I<int> i9; + +template <long N> +I<int> & +f9 (J<int> j) +{ +#pragma omp parallel for lastprivate (i9) + for (i9 = j.begin () + N; i9 <= j.end () - N; i9 = i9 - N) + baz (i9); + return i9; +} + +template <typename T, int N> +I<T> +f10 (const I<T> &x, const I<T> &y) +{ + I<T> i; +#pragma omp parallel for lastprivate (i) + for (i = x; i > y; i = i + N) + baz (i); + return i; +} + +template <typename T, typename U> +T +f11 (T i, const T &x, const T &y) +{ +#pragma omp parallel + #pragma omp for lastprivate (i) + for (i = x + U (2); i <= y + U (1); i = U (2) + U (3) + i) + baz (i); + return T (i); +} + +template <typename T> +T +f12 (const T &x, const T &y) +{ + T i; +#pragma omp parallel for lastprivate (i) + for (i = x; i > y; --i) + baz (i); + return i; +} + +#define check(expr) \ + for (int i = 0; i < 2000; i++) \ + if (expr) \ + { \ + if (results[i] != 1) \ + abort (); \ + results[i] = 0; \ + } \ + else if (results[i]) \ + abort () + +int +main () +{ + int a[2000]; + long b[2000]; + for (int i = 0; i < 2000; i++) + { + a[i] = i; + b[i] = i; + } + if (*f1 (&a[10], &a[1873]) != 1875) + abort (); + check (i >= 10 && i < 1872); + if (*f2 (&a[0], &a[1998]) != 1998) + abort (); + check (i < 1997 && (i & 1) == 0); + if (*f3<int> (&a[10], &a[1971]) != 1962) + 
abort (); + check (i >= 946 && i <= 1961); + if (*f4<int> (&a[0], &a[30]) != 40) + abort (); + check (i > 40 && i <= 2000 - 64); + if (*f5<short> (&a[1931], &a[17]) != 23) + abort (); + check (i > 23 && i <= 1931); + if (*f6<long> (&a[1931], &a[17]) != 16) + abort (); + check (i > 17 && i <= 1924 && (i & 1) == 0); + if (*f7<6> (I<int> (), &a[12], &a[1800]) != 1814) + abort (); + check (i >= 2 && i <= 1808 && (i - 2) % 6 == 0); + if (*f8<121> (J<int> (&a[14], &a[1803])) != 1926) + abort (); + check (i >= 14 && i <= 1924 && (i & 1) == 0); + if (*f9<-3L> (J<int> (&a[27], &a[1761])) != 1767) + abort (); + check (i >= 24 && i <= 1764 && (i % 3) == 0); + if (*f10<int, -7> (&a[1939], &a[17]) != 14) + abort (); + check (i >= 21 && i <= 1939 && i % 7 == 0); + if (*f11<I<int>, short> (I<int> (), &a[71], &a[1941]) != 1943) + abort (); + check (i >= 73 && i <= 1938 && (i - 73) % 5 == 0); + if (*f12<I<int> > (&a[1761], &a[37]) != 37) + abort (); + check (i > 37 && i <= 1761); + if (*f10<long, -7> (&b[1939], &b[17]) != 14) + abort (); + check (i >= 21 && i <= 1939 && i % 7 == 0); + if (*f11<I<long>, short> (I<long> (), &b[71], &b[1941]) != 1943) + abort (); + check (i >= 73 && i <= 1938 && (i - 73) % 5 == 0); + if (*f12<I<long> > (&b[1761], &b[37]) != 37) + abort (); + check (i > 37 && i <= 1761); +} diff --git a/libgomp/testsuite/libgomp.c++/loop-10.C b/libgomp/testsuite/libgomp.c++/loop-10.C new file mode 100644 index 00000000000..9c0de25d56f --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/loop-10.C @@ -0,0 +1,105 @@ +// { dg-do run } + +#include <omp.h> + +extern "C" void abort (void); + +#define LLONG_MAX __LONG_LONG_MAX__ +#define ULLONG_MAX (LLONG_MAX * 2ULL + 1) +#define INT_MAX __INT_MAX__ + +int v; + +int +test1 (void) +{ + int e = 0, cnt = 0; + long long i; + unsigned long long j; + char buf[6], *p; + + #pragma omp for schedule(dynamic,1) collapse(2) nowait + for (i = LLONG_MAX - 30001; i <= LLONG_MAX - 10001; i += 10000) + for (j = 20; j <= LLONG_MAX - 70; j += 
LLONG_MAX + 50ULL) + if ((i != LLONG_MAX - 30001 + && i != LLONG_MAX - 20001 + && i != LLONG_MAX - 10001) + || j != 20) + e = 1; + else + cnt++; + if (e || cnt != 3) + abort (); + else + cnt = 0; + + #pragma omp for schedule(guided,1) collapse(2) nowait + for (i = -LLONG_MAX + 30000; i >= -LLONG_MAX + 10000; i -= 10000) + for (j = ULLONG_MAX - 3; j >= LLONG_MAX + 70ULL; j -= LLONG_MAX + 50ULL) + if ((i != -LLONG_MAX + 30000 + && i != -LLONG_MAX + 20000 + && i != -LLONG_MAX + 10000) + || j != ULLONG_MAX - 3) + e = 1; + else + cnt++; + if (e || cnt != 3) + abort (); + else + cnt = 0; + + #pragma omp for schedule(static,1) collapse(2) nowait + for (i = LLONG_MAX - 30001; i <= LLONG_MAX - 10001; i += 10000) + for (j = 20; j <= LLONG_MAX - 70 + v; j += LLONG_MAX + 50ULL) + if ((i != LLONG_MAX - 30001 + && i != LLONG_MAX - 20001 + && i != LLONG_MAX - 10001) + || j != 20) + e = 1; + else + cnt++; + if (e || cnt != 3) + abort (); + else + cnt = 0; + + #pragma omp for schedule(static) collapse(2) nowait + for (i = -LLONG_MAX + 30000 + v; i >= -LLONG_MAX + 10000; i -= 10000) + for (j = ULLONG_MAX - 3; j >= LLONG_MAX + 70ULL; j -= LLONG_MAX + 50ULL) + if ((i != -LLONG_MAX + 30000 + && i != -LLONG_MAX + 20000 + && i != -LLONG_MAX + 10000) + || j != ULLONG_MAX - 3) + e = 1; + else + cnt++; + if (e || cnt != 3) + abort (); + else + cnt = 0; + + #pragma omp for schedule(runtime) collapse(2) nowait + for (i = 10; i < 30; i++) + for (p = buf; p <= buf + 4; p += 2) + if (i < 10 || i >= 30 || (p != buf && p != buf + 2 && p != buf + 4)) + e = 1; + else + cnt++; + if (e || cnt != 60) + abort (); + else + cnt = 0; + + return 0; +} + +int +main (void) +{ + if (2 * sizeof (int) != sizeof (long long)) + return 0; + asm volatile ("" : "+r" (v)); + omp_set_schedule (omp_sched_dynamic, 1); + test1 (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/loop-8.C b/libgomp/testsuite/libgomp.c++/loop-8.C new file mode 100644 index 00000000000..bc20c68a167 --- /dev/null +++ 
b/libgomp/testsuite/libgomp.c++/loop-8.C @@ -0,0 +1,276 @@ +#include <omp.h> +#include <stdlib.h> +#include <string.h> + +int +test1 () +{ + short int buf[64], *p; + int i; + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for + for (p = &buf[10]; p < &buf[54]; p++) + *p = 5; + for (i = 0; i < 64; i++) + if (buf[i] != 5 * (i >= 10 && i < 54)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for + for (p = &buf[3]; p <= &buf[63]; p += 2) + p[-2] = 6; + for (i = 0; i < 64; i++) + if (buf[i] != 6 * ((i & 1) && i <= 61)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for + for (p = &buf[16]; p < &buf[51]; p = 4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for + for (p = &buf[16]; p <= &buf[40]; p = p + 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i & 3) == 2 && i >= 18 && i <= 42)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for + for (p = &buf[53]; p > &buf[9]; --p) + *p = 5; + for (i = 0; i < 64; i++) + if (buf[i] != 5 * (i >= 10 && i < 54)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for + for (p = &buf[63]; p >= &buf[3]; p -= 2) + p[-2] = 6; + for (i = 0; i < 64; i++) + if (buf[i] != 6 * ((i & 1) && i <= 61)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for + for (p = &buf[48]; p > &buf[15]; p = -4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for + for (p = &buf[40]; p >= &buf[16]; p = p - 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i & 3) == 2 && i >= 18 && i <= 42)) + abort (); + return 0; +} + +int +test2 () +{ + int buf[64], *p; + int i; + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[10]; 
p < &buf[54]; p++) + *p = 5; + for (i = 0; i < 64; i++) + if (buf[i] != 5 * (i >= 10 && i < 54)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[3]; p <= &buf[63]; p += 2) + p[-2] = 6; + for (i = 0; i < 64; i++) + if (buf[i] != 6 * ((i & 1) && i <= 61)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[16]; p < &buf[51]; p = 4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[16]; p <= &buf[40]; p = p + 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i & 3) == 2 && i >= 18 && i <= 42)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[53]; p > &buf[9]; --p) + *p = 5; + for (i = 0; i < 64; i++) + if (buf[i] != 5 * (i >= 10 && i < 54)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[63]; p >= &buf[3]; p -= 2) + p[-2] = 6; + for (i = 0; i < 64; i++) + if (buf[i] != 6 * ((i & 1) && i <= 61)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[48]; p > &buf[15]; p = -4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[40]; p >= &buf[16]; p = p - 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i & 3) == 2 && i >= 18 && i <= 42)) + abort (); + return 0; +} + +int +test3 () +{ + int buf[64], *p; + int i; + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for (p = &buf[10]; p < &buf[54]; p++) + *p = 5; + for (i = 0; i < 64; i++) + if (buf[i] != 5 * (i >= 10 && i < 54)) + abort 
(); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for (p = &buf[3]; p <= &buf[63]; p += 2) + p[-2] = 6; + for (i = 0; i < 64; i++) + if (buf[i] != 6 * ((i & 1) && i <= 61)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for (p = &buf[16]; p < &buf[51]; p = 4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for (p = &buf[16]; p <= &buf[40]; p = p + 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i & 3) == 2 && i >= 18 && i <= 42)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for (p = &buf[53]; p > &buf[9]; --p) + *p = 5; + for (i = 0; i < 64; i++) + if (buf[i] != 5 * (i >= 10 && i < 54)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for (p = &buf[63]; p >= &buf[3]; p -= 2) + p[-2] = 6; + for (i = 0; i < 64; i++) + if (buf[i] != 6 * ((i & 1) && i <= 61)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for (p = &buf[48]; p > &buf[15]; p = -4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for (p = &buf[40]; p >= &buf[16]; p = p - 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i & 3) == 2 && i >= 18 && i <= 42)) + abort (); + return 0; +} + +int +test4 () +{ + int buf[64], *p; + int i; + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = &buf[10]; p < &buf[54]; p++) + *p = 5; + for (i = 0; i < 64; i++) + if (buf[i] != 5 * (i >= 10 && i < 54)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = 
&buf[3]; p <= &buf[63]; p += 2) + p[-2] = 6; + for (i = 0; i < 64; i++) + if (buf[i] != 6 * ((i & 1) && i <= 61)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = &buf[16]; p < &buf[51]; p = 4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = &buf[16]; p <= &buf[40]; p = p + 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i & 3) == 2 && i >= 18 && i <= 42)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = &buf[53]; p > &buf[9]; --p) + *p = 5; + for (i = 0; i < 64; i++) + if (buf[i] != 5 * (i >= 10 && i < 54)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = &buf[63]; p >= &buf[3]; p -= 2) + p[-2] = 6; + for (i = 0; i < 64; i++) + if (buf[i] != 6 * ((i & 1) && i <= 61)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = &buf[48]; p > &buf[15]; p = -4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = &buf[40]; p >= &buf[16]; p = p - 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i & 3) == 2 && i >= 18 && i <= 42)) + abort (); + return 0; +} + +int +main () +{ + test1 (); + test2 (); + test3 (); + omp_set_schedule (omp_sched_static, 0); + test4 (); + omp_set_schedule (omp_sched_static, 3); + test4 (); + omp_set_schedule (omp_sched_dynamic, 5); + test4 (); + omp_set_schedule (omp_sched_guided, 2); + test4 (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/loop-9.C b/libgomp/testsuite/libgomp.c++/loop-9.C new file mode 100644 index 00000000000..35daf2276e8 --- /dev/null +++ 
b/libgomp/testsuite/libgomp.c++/loop-9.C @@ -0,0 +1,387 @@ +// { dg-do run } + +#include <omp.h> + +extern "C" void abort (); + +#define LLONG_MAX __LONG_LONG_MAX__ +#define ULLONG_MAX (LLONG_MAX * 2ULL + 1) +#define INT_MAX __INT_MAX__ + +int arr[6 * 5]; + +void +set (int loopidx, int idx) +{ +#pragma omp atomic + arr[loopidx * 5 + idx]++; +} + +#define check(var, val, loopidx, idx) \ + if (var == (val)) set (loopidx, idx); else +#define test(loopidx, count) \ + for (idx = 0; idx < 5; idx++) \ + if (arr[loopidx * 5 + idx] != idx < count) \ + abort (); \ + else \ + arr[loopidx * 5 + idx] = 0 + +int +test1 () +{ + int e = 0, idx; + +#pragma omp parallel reduction(+:e) + { + long long i; + unsigned long long j; + #pragma omp for schedule(dynamic,1) nowait + for (i = LLONG_MAX - 30001; i <= LLONG_MAX - 10001; i += 10000) + { + check (i, LLONG_MAX - 30001, 0, 0) + check (i, LLONG_MAX - 20001, 0, 1) + check (i, LLONG_MAX - 10001, 0, 2) + e = 1; + } + #pragma omp for schedule(dynamic,1) nowait + for (i = -LLONG_MAX + 30000; i >= -LLONG_MAX + 10000; i -= 10000) + { + check (i, -LLONG_MAX + 30000, 1, 0) + check (i, -LLONG_MAX + 20000, 1, 1) + check (i, -LLONG_MAX + 10000, 1, 2) + e = 1; + } + #pragma omp for schedule(dynamic,1) nowait + for (j = 20; j <= LLONG_MAX - 70; j += LLONG_MAX + 50ULL) + { + check (j, 20, 2, 0) + e = 1; + } + #pragma omp for schedule(dynamic,1) nowait + for (j = ULLONG_MAX - 3; j >= LLONG_MAX + 70ULL; j -= LLONG_MAX + 50ULL) + { + check (j, ULLONG_MAX - 3, 3, 0) + e = 1; + } + #pragma omp for schedule(dynamic,1) nowait + for (j = LLONG_MAX - 20000ULL; j <= LLONG_MAX + 10000ULL; j += 10000ULL) + { + check (j, LLONG_MAX - 20000ULL, 4, 0) + check (j, LLONG_MAX - 10000ULL, 4, 1) + check (j, LLONG_MAX, 4, 2) + check (j, LLONG_MAX + 10000ULL, 4, 3) + e = 1; + } + #pragma omp for schedule(dynamic,1) nowait + for (i = -3LL * INT_MAX - 20000LL; i <= INT_MAX + 10000LL; i += INT_MAX + 200LL) + { + check (i, -3LL * INT_MAX - 20000LL, 5, 0) + check (i, -2LL * 
INT_MAX - 20000LL + 200LL, 5, 1) + check (i, -INT_MAX - 20000LL + 400LL, 5, 2) + check (i, -20000LL + 600LL, 5, 3) + check (i, INT_MAX - 20000LL + 800LL, 5, 4) + e = 1; + } + } + if (e) + abort (); + test (0, 3); + test (1, 3); + test (2, 1); + test (3, 1); + test (4, 4); + test (5, 5); + return 0; +} + +int +test2 () +{ + int e = 0, idx; + +#pragma omp parallel reduction(+:e) + { + long long i; + unsigned long long j; + #pragma omp for schedule(guided,1) nowait + for (i = LLONG_MAX - 30001; i <= LLONG_MAX - 10001; i += 10000) + { + check (i, LLONG_MAX - 30001, 0, 0) + check (i, LLONG_MAX - 20001, 0, 1) + check (i, LLONG_MAX - 10001, 0, 2) + e = 1; + } + #pragma omp for schedule(guided,1) nowait + for (i = -LLONG_MAX + 30000; i >= -LLONG_MAX + 10000; i -= 10000) + { + check (i, -LLONG_MAX + 30000, 1, 0) + check (i, -LLONG_MAX + 20000, 1, 1) + check (i, -LLONG_MAX + 10000, 1, 2) + e = 1; + } + #pragma omp for schedule(guided,1) nowait + for (j = 20; j <= LLONG_MAX - 70; j += LLONG_MAX + 50ULL) + { + check (j, 20, 2, 0) + e = 1; + } + #pragma omp for schedule(guided,1) nowait + for (j = ULLONG_MAX - 3; j >= LLONG_MAX + 70ULL; j -= LLONG_MAX + 50ULL) + { + check (j, ULLONG_MAX - 3, 3, 0) + e = 1; + } + #pragma omp for schedule(guided,1) nowait + for (j = LLONG_MAX - 20000ULL; j <= LLONG_MAX + 10000ULL; j += 10000ULL) + { + check (j, LLONG_MAX - 20000ULL, 4, 0) + check (j, LLONG_MAX - 10000ULL, 4, 1) + check (j, LLONG_MAX, 4, 2) + check (j, LLONG_MAX + 10000ULL, 4, 3) + e = 1; + } + #pragma omp for schedule(guided,1) nowait + for (i = -3LL * INT_MAX - 20000LL; i <= INT_MAX + 10000LL; i += INT_MAX + 200LL) + { + check (i, -3LL * INT_MAX - 20000LL, 5, 0) + check (i, -2LL * INT_MAX - 20000LL + 200LL, 5, 1) + check (i, -INT_MAX - 20000LL + 400LL, 5, 2) + check (i, -20000LL + 600LL, 5, 3) + check (i, INT_MAX - 20000LL + 800LL, 5, 4) + e = 1; + } + } + if (e) + abort (); + test (0, 3); + test (1, 3); + test (2, 1); + test (3, 1); + test (4, 4); + test (5, 5); + return 0; +} 
+ +int +test3 () +{ + int e = 0, idx; + +#pragma omp parallel reduction(+:e) + { + long long i; + unsigned long long j; + #pragma omp for schedule(static) nowait + for (i = LLONG_MAX - 30001; i <= LLONG_MAX - 10001; i += 10000) + { + check (i, LLONG_MAX - 30001, 0, 0) + check (i, LLONG_MAX - 20001, 0, 1) + check (i, LLONG_MAX - 10001, 0, 2) + e = 1; + } + #pragma omp for schedule(static) nowait + for (i = -LLONG_MAX + 30000; i >= -LLONG_MAX + 10000; i -= 10000) + { + check (i, -LLONG_MAX + 30000, 1, 0) + check (i, -LLONG_MAX + 20000, 1, 1) + check (i, -LLONG_MAX + 10000, 1, 2) + e = 1; + } + #pragma omp for schedule(static) nowait + for (j = 20; j <= LLONG_MAX - 70; j += LLONG_MAX + 50ULL) + { + check (j, 20, 2, 0) + e = 1; + } + #pragma omp for schedule(static) nowait + for (j = ULLONG_MAX - 3; j >= LLONG_MAX + 70ULL; j -= LLONG_MAX + 50ULL) + { + check (j, ULLONG_MAX - 3, 3, 0) + e = 1; + } + #pragma omp for schedule(static) nowait + for (j = LLONG_MAX - 20000ULL; j <= LLONG_MAX + 10000ULL; j += 10000ULL) + { + check (j, LLONG_MAX - 20000ULL, 4, 0) + check (j, LLONG_MAX - 10000ULL, 4, 1) + check (j, LLONG_MAX, 4, 2) + check (j, LLONG_MAX + 10000ULL, 4, 3) + e = 1; + } + #pragma omp for schedule(static) nowait + for (i = -3LL * INT_MAX - 20000LL; i <= INT_MAX + 10000LL; i += INT_MAX + 200LL) + { + check (i, -3LL * INT_MAX - 20000LL, 5, 0) + check (i, -2LL * INT_MAX - 20000LL + 200LL, 5, 1) + check (i, -INT_MAX - 20000LL + 400LL, 5, 2) + check (i, -20000LL + 600LL, 5, 3) + check (i, INT_MAX - 20000LL + 800LL, 5, 4) + e = 1; + } + } + if (e) + abort (); + test (0, 3); + test (1, 3); + test (2, 1); + test (3, 1); + test (4, 4); + test (5, 5); + return 0; +} + +int +test4 () +{ + int e = 0, idx; + +#pragma omp parallel reduction(+:e) + { + long long i; + unsigned long long j; + #pragma omp for schedule(static,1) nowait + for (i = LLONG_MAX - 30001; i <= LLONG_MAX - 10001; i += 10000) + { + check (i, LLONG_MAX - 30001, 0, 0) + check (i, LLONG_MAX - 20001, 0, 1) + check 
(i, LLONG_MAX - 10001, 0, 2) + e = 1; + } + #pragma omp for schedule(static,1) nowait + for (i = -LLONG_MAX + 30000; i >= -LLONG_MAX + 10000; i -= 10000) + { + check (i, -LLONG_MAX + 30000, 1, 0) + check (i, -LLONG_MAX + 20000, 1, 1) + check (i, -LLONG_MAX + 10000, 1, 2) + e = 1; + } + #pragma omp for schedule(static,1) nowait + for (j = 20; j <= LLONG_MAX - 70; j += LLONG_MAX + 50ULL) + { + check (j, 20, 2, 0) + e = 1; + } + #pragma omp for schedule(static,1) nowait + for (j = ULLONG_MAX - 3; j >= LLONG_MAX + 70ULL; j -= LLONG_MAX + 50ULL) + { + check (j, ULLONG_MAX - 3, 3, 0) + e = 1; + } + #pragma omp for schedule(static,1) nowait + for (j = LLONG_MAX - 20000ULL; j <= LLONG_MAX + 10000ULL; j += 10000ULL) + { + check (j, LLONG_MAX - 20000ULL, 4, 0) + check (j, LLONG_MAX - 10000ULL, 4, 1) + check (j, LLONG_MAX, 4, 2) + check (j, LLONG_MAX + 10000ULL, 4, 3) + e = 1; + } + #pragma omp for schedule(static,1) nowait + for (i = -3LL * INT_MAX - 20000LL; i <= INT_MAX + 10000LL; i += INT_MAX + 200LL) + { + check (i, -3LL * INT_MAX - 20000LL, 5, 0) + check (i, -2LL * INT_MAX - 20000LL + 200LL, 5, 1) + check (i, -INT_MAX - 20000LL + 400LL, 5, 2) + check (i, -20000LL + 600LL, 5, 3) + check (i, INT_MAX - 20000LL + 800LL, 5, 4) + e = 1; + } + } + if (e) + abort (); + test (0, 3); + test (1, 3); + test (2, 1); + test (3, 1); + test (4, 4); + test (5, 5); + return 0; +} + +int +test5 () +{ + int e = 0, idx; + +#pragma omp parallel reduction(+:e) + { + long long i; + unsigned long long j; + #pragma omp for schedule(runtime) nowait + for (i = LLONG_MAX - 30001; i <= LLONG_MAX - 10001; i += 10000) + { + check (i, LLONG_MAX - 30001, 0, 0) + check (i, LLONG_MAX - 20001, 0, 1) + check (i, LLONG_MAX - 10001, 0, 2) + e = 1; + } + #pragma omp for schedule(runtime) nowait + for (i = -LLONG_MAX + 30000; i >= -LLONG_MAX + 10000; i -= 10000) + { + check (i, -LLONG_MAX + 30000, 1, 0) + check (i, -LLONG_MAX + 20000, 1, 1) + check (i, -LLONG_MAX + 10000, 1, 2) + e = 1; + } + #pragma omp for 
schedule(runtime) nowait + for (j = 20; j <= LLONG_MAX - 70; j += LLONG_MAX + 50ULL) + { + check (j, 20, 2, 0) + e = 1; + } + #pragma omp for schedule(runtime) nowait + for (j = ULLONG_MAX - 3; j >= LLONG_MAX + 70ULL; j -= LLONG_MAX + 50ULL) + { + check (j, ULLONG_MAX - 3, 3, 0) + e = 1; + } + #pragma omp for schedule(runtime) nowait + for (j = LLONG_MAX - 20000ULL; j <= LLONG_MAX + 10000ULL; j += 10000ULL) + { + check (j, LLONG_MAX - 20000ULL, 4, 0) + check (j, LLONG_MAX - 10000ULL, 4, 1) + check (j, LLONG_MAX, 4, 2) + check (j, LLONG_MAX + 10000ULL, 4, 3) + e = 1; + } + #pragma omp for schedule(runtime) nowait + for (i = -3LL * INT_MAX - 20000LL; i <= INT_MAX + 10000LL; i += INT_MAX + 200LL) + { + check (i, -3LL * INT_MAX - 20000LL, 5, 0) + check (i, -2LL * INT_MAX - 20000LL + 200LL, 5, 1) + check (i, -INT_MAX - 20000LL + 400LL, 5, 2) + check (i, -20000LL + 600LL, 5, 3) + check (i, INT_MAX - 20000LL + 800LL, 5, 4) + e = 1; + } + } + if (e) + abort (); + test (0, 3); + test (1, 3); + test (2, 1); + test (3, 1); + test (4, 4); + test (5, 5); + return 0; +} + +int +main () +{ + if (2 * sizeof (int) != sizeof (long long)) + return 0; + test1 (); + test2 (); + test3 (); + test4 (); + omp_set_schedule (omp_sched_static, 0); + test5 (); + omp_set_schedule (omp_sched_static, 3); + test5 (); + omp_set_schedule (omp_sched_dynamic, 5); + test5 (); + omp_set_schedule (omp_sched_guided, 2); + test5 (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/task-1.C b/libgomp/testsuite/libgomp.c++/task-1.C new file mode 100644 index 00000000000..535a8287b0c --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/task-1.C @@ -0,0 +1,83 @@ +extern "C" void abort (); + +int a = 18; + +void +f1 (int i, int j, int k) +{ + int l = 6, m = 7, n = 8; +#pragma omp task private(j, m) shared(k, n) + { + j = 6; + m = 5; + if (++a != 19 || ++i != 9 || j != 6 || ++l != 7 || m != 5 || ++n != 9) + #pragma omp atomic + k++; + } +#pragma omp taskwait + if (a != 19 || i != 8 || j != 26 || k != 0 || 
l != 6 || m != 7 || n != 9) + abort (); +} + +int v1 = 1, v2 = 2, v5 = 5; +int err; + +void +f2 (void) +{ + int v3 = 3; +#pragma omp sections private (v1) firstprivate (v2) + { + #pragma omp section + { + int v4 = 4; + v1 = 7; + #pragma omp task + { + if (++v1 != 8 || ++v2 != 3 || ++v3 != 4 || ++v4 != 5 || ++v5 != 6) + err = 1; + } + #pragma omp taskwait + if (v1 != 7 || v2 != 2 || v3 != 3 || v4 != 4 || v5 != 6) + abort (); + if (err) + abort (); + } + } +} + +void +f3 (int i, int j, int k) +{ + int l = 6, m = 7, n = 8; +#pragma omp task private(j, m) shared(k, n) untied + { + j = 6; + m = 5; + if (++a != 19 || ++i != 9 || j != 6 || ++l != 7 || m != 5 || ++n != 9) + #pragma omp atomic + k++; + } +#pragma omp taskwait + if (a != 19 || i != 8 || j != 26 || k != 0 || l != 6 || m != 7 || n != 9) + abort (); +} + +int +main () +{ + f1 (8, 26, 0); + f2 (); + a = 18; + f3 (8, 26, 0); + a = 18; +#pragma omp parallel num_threads(4) + { + #pragma omp master + { + f1 (8, 26, 0); + a = 18; + f3 (8, 26, 0); + } + } +} diff --git a/libgomp/testsuite/libgomp.c++/task-2.C b/libgomp/testsuite/libgomp.c++/task-2.C new file mode 100644 index 00000000000..a198cc721b5 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/task-2.C @@ -0,0 +1,70 @@ +// { dg-do run } + +#include <omp.h> +extern "C" void abort (); + +int l = 5; + +int +foo (int i) +{ + int j = 7; + const int k = 8; + #pragma omp task firstprivate (i) shared (j, l) + { + #pragma omp critical + { + j += i; + l += k; + } + } + i++; + #pragma omp task firstprivate (i) shared (j, l) + { + #pragma omp critical + { + j += i; + l += k; + } + } + i++; + #pragma omp task firstprivate (i) shared (j, l) + { + #pragma omp critical + { + j += i; + l += k; + } + } + i++; + #pragma omp task firstprivate (i) shared (j, l) + { + #pragma omp critical + { + j += i; + l += k; + } + } + i++; + #pragma omp taskwait + return (i != 8 * omp_get_thread_num () + 4 + || j != 4 * i - 3 + || k != 8); +} + +int +main (void) +{ + int r = 0; + #pragma omp 
parallel num_threads (4) reduction(+:r) + if (omp_get_num_threads () != 4) + { + #pragma omp master + l = 133; + } + else if (foo (8 * omp_get_thread_num ())) + r++; + if (r || l != 133) + abort (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/task-3.C b/libgomp/testsuite/libgomp.c++/task-3.C new file mode 100644 index 00000000000..e1ecb49654a --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/task-3.C @@ -0,0 +1,90 @@ +// { dg-do run } + +extern "C" void abort (); + +struct A +{ + A (); + ~A (); + A (const A &); + unsigned long l; +}; + +int e; + +A::A () +{ + l = 17; +} + +A::~A () +{ + if (l > 30) + #pragma omp atomic + e++; +} + +A::A (const A &r) +{ + l = r.l; +} + +void +check (int i, A &a, int j, A &b) +{ + if (i != 6 || a.l != 21 || j != 0 || b.l != 23) + #pragma omp atomic + e++; +} + +A b; +int j; + +void +foo (int i) +{ + A a; + a.l = 21; + #pragma omp task firstprivate (i, a, j, b) + check (i, a, j, b); +} + +void +bar (int i, A a) +{ + a.l = 21; + #pragma omp task firstprivate (i, a, j, b) + check (i, a, j, b); +} + +A +baz () +{ + A a, c; + a.l = 21; + c.l = 23; + #pragma omp task firstprivate (a, c) + check (6, a, 0, c); + return a; +} + +int +main () +{ + b.l = 23; + foo (6); + bar (6, A ()); + baz (); + #pragma omp parallel num_threads (4) + { + #pragma omp single + for (int i = 0; i < 64; i++) + { + foo (6); + bar (6, A ()); + baz (); + } + } + if (e) + abort (); +} diff --git a/libgomp/testsuite/libgomp.c++/task-4.C b/libgomp/testsuite/libgomp.c++/task-4.C new file mode 100644 index 00000000000..f2e786a2fdd --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/task-4.C @@ -0,0 +1,37 @@ +#include <omp.h> +extern "C" void *memset (void *, int, __SIZE_TYPE__); +extern "C" void abort (void); + +int e; + +void +baz (int i, int *p, int j, int *q) +{ + if (p[0] != 1 || p[i] != 3 || q[0] != 2 || q[j] != 4) + #pragma omp atomic + e++; +} + +void +foo (int i, int j) +{ + int p[i + 1]; + int q[j + 1]; + memset (p, 0, sizeof (p)); + memset (q, 0, 
sizeof (q)); + p[0] = 1; + p[i] = 3; + q[0] = 2; + q[j] = 4; + #pragma omp task firstprivate (p, q) + baz (i, p, j, q); +} + +int +main () +{ + #pragma omp parallel num_threads (4) + foo (5 + omp_get_thread_num (), 7 + omp_get_thread_num ()); + if (e) + abort (); +} diff --git a/libgomp/testsuite/libgomp.c++/task-5.C b/libgomp/testsuite/libgomp.c++/task-5.C new file mode 100644 index 00000000000..c882bfe1517 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/task-5.C @@ -0,0 +1,90 @@ +// { dg-do run } + +extern "C" void abort (); + +struct A +{ + A (); + ~A (); + A (const A &); + unsigned long l; +}; + +int e; + +A::A () +{ + l = 17; +} + +A::~A () +{ + if (l > 130) + #pragma omp atomic + e++; +} + +A::A (const A &r) +{ + l = r.l + 64; +} + +void +check (int i, A &a, int j, A &b) +{ + if (i != 6 || a.l != 21 + 64 || j != 0 || b.l != 23 + 64) + #pragma omp atomic + e++; +} + +A b; +int j; + +void +foo (int i) +{ + A a; + a.l = 21; + #pragma omp task firstprivate (j, b) + check (i, a, j, b); +} + +void +bar (int i, A a) +{ + a.l = 21; + #pragma omp task firstprivate (j, b) + check (i, a, j, b); +} + +A +baz () +{ + A a, c; + a.l = 21; + c.l = 23; + #pragma omp task firstprivate (a, c) + check (6, a, 0, c); + return a; +} + +int +main () +{ + b.l = 23; + foo (6); + bar (6, A ()); + baz (); + #pragma omp parallel num_threads (4) + { + #pragma omp single + for (int i = 0; i < 64; i++) + { + foo (6); + bar (6, A ()); + baz (); + } + } + if (e) + abort (); +} diff --git a/libgomp/testsuite/libgomp.c++/task-6.C b/libgomp/testsuite/libgomp.c++/task-6.C new file mode 100644 index 00000000000..cc9072b9d1c --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/task-6.C @@ -0,0 +1,86 @@ +extern "C" void abort (); + +int a = 18; + +template <typename T> +void +f1 (T i, T j, T k) +{ + T l = 6, m = 7, n = 8; +#pragma omp task private(j, m) shared(k, n) + { + j = 6; + m = 5; + if (++a != 19 || ++i != 9 || j != 6 || ++l != 7 || m != 5 || ++n != 9) + #pragma omp atomic + k++; + } +#pragma 
omp taskwait + if (a != 19 || i != 8 || j != 26 || k != 0 || l != 6 || m != 7 || n != 9) + abort (); +} + +int v1 = 1, v2 = 2, v5 = 5; +int err; + +template <typename T> +void +f2 (void) +{ + T v3 = 3; +#pragma omp sections private (v1) firstprivate (v2) + { + #pragma omp section + { + T v4 = 4; + v1 = 7; + #pragma omp task + { + if (++v1 != 8 || ++v2 != 3 || ++v3 != 4 || ++v4 != 5 || ++v5 != 6) + err = 1; + } + #pragma omp taskwait + if (v1 != 7 || v2 != 2 || v3 != 3 || v4 != 4 || v5 != 6) + abort (); + if (err) + abort (); + } + } +} + +template <typename T> +void +f3 (T i, T j, T k) +{ + T l = 6, m = 7, n = 8; +#pragma omp task private(j, m) shared(k, n) untied + { + j = 6; + m = 5; + if (++a != 19 || ++i != 9 || j != 6 || ++l != 7 || m != 5 || ++n != 9) + #pragma omp atomic + k++; + } +#pragma omp taskwait + if (a != 19 || i != 8 || j != 26 || k != 0 || l != 6 || m != 7 || n != 9) + abort (); +} + +int +main () +{ + f1 <int> (8, 26, 0); + f2 <int> (); + a = 18; + f3 <int> (8, 26, 0); + a = 18; +#pragma omp parallel num_threads(4) + { + #pragma omp master + { + f1 <int> (8, 26, 0); + a = 18; + f3 <int> (8, 26, 0); + } + } +} diff --git a/libgomp/testsuite/libgomp.c/collapse-1.c b/libgomp/testsuite/libgomp.c/collapse-1.c new file mode 100644 index 00000000000..82becfa7952 --- /dev/null +++ b/libgomp/testsuite/libgomp.c/collapse-1.c @@ -0,0 +1,30 @@ +/* { dg-do run } */ + +#include <string.h> +#include <stdlib.h> + +int +main (void) +{ + int i, j, k, l = 0; + int a[3][3][3]; + + memset (a, '\0', sizeof (a)); + #pragma omp parallel for collapse(4 - 1) schedule(static, 4) + for (i = 0; i < 2; i++) + for (j = 0; j < 2; j++) + for (k = 0; k < 2; k++) + a[i][j][k] = i + j * 4 + k * 16; + #pragma omp parallel + { + #pragma omp for collapse(2) reduction(|:l) + for (i = 0; i < 2; i++) + for (j = 0; j < 2; j++) + for (k = 0; k < 2; k++) + if (a[i][j][k] != i + j * 4 + k * 16) + l = 1; + } + if (l) + abort (); + return 0; +} diff --git 
a/libgomp/testsuite/libgomp.c/collapse-2.c b/libgomp/testsuite/libgomp.c/collapse-2.c new file mode 100644 index 00000000000..b5c77d46143 --- /dev/null +++ b/libgomp/testsuite/libgomp.c/collapse-2.c @@ -0,0 +1,30 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <omp.h> + +int +main (void) +{ + int i, j, k, l = 0, f = 0; + int m1 = 4, m2 = -5, m3 = 17; + + #pragma omp parallel for num_threads (8) collapse(3) \ + schedule(static, 9) reduction(+:l) \ + firstprivate(f) + for (i = -2; i < m1; i++) + for (j = m2; j < -2; j++) + { + for (k = 13; k < m3; k++) + { + if (omp_get_num_threads () == 8 + && ((i + 2) * 12 + (j + 5) * 4 + (k - 13) + != (omp_get_thread_num () * 9 + + f++))) + l++; + } + } + if (l) + abort (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c/collapse-3.c b/libgomp/testsuite/libgomp.c/collapse-3.c new file mode 100644 index 00000000000..4674f83f4b6 --- /dev/null +++ b/libgomp/testsuite/libgomp.c/collapse-3.c @@ -0,0 +1,31 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -std=gnu99" } */ + +#include <string.h> +#include <stdlib.h> + +int +main (void) +{ + int i2, l = 0; + int a[3][3][3]; + + memset (a, '\0', sizeof (a)); + #pragma omp parallel for collapse(4 - 1) schedule(static, 4) + for (int i = 0; i < 2; i++) + for (int j = 0; j < 2; j++) + for (int k = 0; k < 2; k++) + a[i][j][k] = i + j * 4 + k * 16; + #pragma omp parallel + { + #pragma omp for collapse(2) reduction(|:l) + for (i2 = 0; i2 < 2; i2++) + for (int j = 0; j < 2; j++) + for (int k = 0; k < 2; k++) + if (a[i2][j][k] != i2 + j * 4 + k * 16) + l = 1; + } + if (l) + abort (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c/icv-1.c b/libgomp/testsuite/libgomp.c/icv-1.c new file mode 100644 index 00000000000..99708f82306 --- /dev/null +++ b/libgomp/testsuite/libgomp.c/icv-1.c @@ -0,0 +1,33 @@ +#include <omp.h> +#include <stdlib.h> + +int +main (void) +{ + int err = 0; + + omp_set_num_threads (4); + if (omp_get_max_threads () != 4) + abort (); + #pragma omp parallel 
reduction(|: err) num_threads(1)
+  {
+    if (omp_get_max_threads () != 4)
+      err |= 1;
+    omp_set_num_threads (6);
+    #pragma omp task if(0) shared(err)
+    {
+      if (omp_get_max_threads () != 6)
+        err |= 2;
+      omp_set_num_threads (5);
+      if (omp_get_max_threads () != 5)
+        err |= 4;
+    }
+    if (omp_get_max_threads () != 6)
+      err |= 8;
+  }
+  if (err)
+    abort ();
+  if (omp_get_max_threads () != 4)
+    abort ();
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/icv-2.c b/libgomp/testsuite/libgomp.c/icv-2.c
new file mode 100644
index 00000000000..326f8eb404a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/icv-2.c
@@ -0,0 +1,46 @@
+/* { dg-do run { target *-*-linux* } } */
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE 1
+#endif
+#include <pthread.h>
+#include <omp.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+pthread_barrier_t bar;
+
+void *tf (void *p)
+{
+  int l;
+  if (p)
+    omp_set_num_threads (3);
+  pthread_barrier_wait (&bar);
+  if (!p)
+    omp_set_num_threads (6);
+  pthread_barrier_wait (&bar);
+  omp_set_dynamic (0);
+  if (omp_get_max_threads () != (p ? 3 : 6))
+    abort ();
+  l = 0;
+  #pragma omp parallel num_threads (6) reduction (|:l)
+  {
+    l |= omp_get_max_threads () != (p ? 3 : 6);
+    omp_set_num_threads ((p ? 3 : 6) + omp_get_thread_num ());
+    l |= omp_get_max_threads () != ((p ? 3 : 6) + omp_get_thread_num ());
+  }
+  if (l)
+    abort ();
+  return NULL;
+}
+
+int
+main (void)
+{
+  pthread_t th;
+  pthread_barrier_init (&bar, NULL, 2);
+  pthread_create (&th, NULL, tf, NULL);
+  tf ("");
+  pthread_join (th, NULL);
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/lib-2.c b/libgomp/testsuite/libgomp.c/lib-2.c
new file mode 100644
index 00000000000..3a3b3f65517
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/lib-2.c
@@ -0,0 +1,25 @@
+#include <stdlib.h>
+#include <omp.h>
+
+int
+main (void)
+{
+  omp_sched_t kind;
+  int modifier;
+
+  omp_set_schedule (omp_sched_static, 32);
+  omp_get_schedule (&kind, &modifier);
+  if (kind != omp_sched_static || modifier != 32)
+    abort ();
+  omp_set_schedule (omp_sched_guided, 4);
+  omp_get_schedule (&kind, &modifier);
+  if (kind != omp_sched_guided || modifier != 4)
+    abort ();
+  if (omp_get_thread_limit () < 0)
+    abort ();
+  omp_set_max_active_levels (6);
+  if (omp_get_max_active_levels () != 6)
+    abort ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/lock-1.c b/libgomp/testsuite/libgomp.c/lock-1.c
new file mode 100644
index 00000000000..e09645dbc3f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/lock-1.c
@@ -0,0 +1,31 @@
+#include <omp.h>
+#include <stdlib.h>
+
+int
+main (void)
+{
+  int l = 0;
+  omp_nest_lock_t lock;
+  omp_init_nest_lock (&lock);
+  if (omp_test_nest_lock (&lock) != 1)
+    abort ();
+  if (omp_test_nest_lock (&lock) != 2)
+    abort ();
+#pragma omp parallel if (0) reduction (+:l)
+  {
+    /* In OpenMP 2.5 this was supposed to return 3,
+       but in OpenMP 3.0 the parallel region has a different
+       task and omp_*_lock_t are owned by tasks, not by threads.  */
+    if (omp_test_nest_lock (&lock) != 0)
+      l++;
+  }
+  if (l)
+    abort ();
+  if (omp_test_nest_lock (&lock) != 3)
+    abort ();
+  omp_unset_nest_lock (&lock);
+  omp_unset_nest_lock (&lock);
+  omp_unset_nest_lock (&lock);
+  omp_destroy_nest_lock (&lock);
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/lock-2.c b/libgomp/testsuite/libgomp.c/lock-2.c
new file mode 100644
index 00000000000..9009b12fe5d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/lock-2.c
@@ -0,0 +1,32 @@
+#include <omp.h>
+#include <stdlib.h>
+
+int
+main (void)
+{
+  int l = 0;
+  omp_nest_lock_t lock;
+  omp_init_nest_lock (&lock);
+#pragma omp parallel reduction (+:l) num_threads (1)
+  {
+    if (omp_test_nest_lock (&lock) != 1)
+      l++;
+    if (omp_test_nest_lock (&lock) != 2)
+      l++;
+    #pragma omp task if (0) shared (lock, l)
+    {
+      if (omp_test_nest_lock (&lock) != 0)
+        l++;
+    }
+    #pragma omp taskwait
+    if (omp_test_nest_lock (&lock) != 3)
+      l++;
+    omp_unset_nest_lock (&lock);
+    omp_unset_nest_lock (&lock);
+    omp_unset_nest_lock (&lock);
+  }
+  if (l)
+    abort ();
+  omp_destroy_nest_lock (&lock);
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/lock-3.c b/libgomp/testsuite/libgomp.c/lock-3.c
new file mode 100644
index 00000000000..1fc83726d18
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/lock-3.c
@@ -0,0 +1,60 @@
+/* { dg-do run { target *-*-linux* } } */
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE 1
+#endif
+#include <pthread.h>
+#include <omp.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+pthread_barrier_t bar;
+omp_nest_lock_t lock;
+
+void *tf (void *p)
+{
+  int l;
+  if (p)
+    {
+      if (omp_test_nest_lock (&lock) != 1)
+        abort ();
+      if (omp_test_nest_lock (&lock) != 2)
+        abort ();
+    }
+  pthread_barrier_wait (&bar);
+  if (!p && omp_test_nest_lock (&lock) != 0)
+    abort ();
+  pthread_barrier_wait (&bar);
+  if (p)
+    {
+      if (omp_test_nest_lock (&lock) != 3)
+        abort ();
+      omp_unset_nest_lock (&lock);
+      omp_unset_nest_lock (&lock);
+      omp_unset_nest_lock (&lock);
+    }
+  pthread_barrier_wait (&bar);
+  if (!p)
+    {
+      if (omp_test_nest_lock (&lock) != 1)
+        abort ();
+      if (omp_test_nest_lock (&lock) != 2)
+        abort ();
+      omp_unset_nest_lock (&lock);
+      omp_unset_nest_lock (&lock);
+    }
+  return NULL;
+}
+
+int
+main (void)
+{
+  pthread_t th;
+  omp_init_nest_lock (&lock);
+  pthread_barrier_init (&bar, NULL, 2);
+  pthread_create (&th, NULL, tf, NULL);
+  tf ("");
+  pthread_join (th, NULL);
+  omp_destroy_nest_lock (&lock);
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/loop-4.c b/libgomp/testsuite/libgomp.c/loop-4.c
new file mode 100644
index 00000000000..bc57c043aad
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/loop-4.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+
+extern void abort (void);
+
+int
+main (void)
+{
+  int e = 0;
+#pragma omp parallel num_threads (4) reduction(+:e)
+  {
+    long i;
+    #pragma omp for schedule(dynamic,1)
+    for (i = __LONG_MAX__ - 30001; i <= __LONG_MAX__ - 10001; i += 10000)
+      if (i != __LONG_MAX__ - 30001
+          && i != __LONG_MAX__ - 20001
+          && i != __LONG_MAX__ - 10001)
+        e = 1;
+    #pragma omp for schedule(dynamic,1)
+    for (i = -__LONG_MAX__ + 30000; i >= -__LONG_MAX__ + 10000; i -= 10000)
+      if (i != -__LONG_MAX__ + 30000
+          && i != -__LONG_MAX__ + 20000
+          && i != -__LONG_MAX__ + 10000)
+        e = 1;
+  }
+  if (e)
+    abort ();
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/loop-5.c b/libgomp/testsuite/libgomp.c/loop-5.c
new file mode 100644
index 00000000000..3a5c7cf4556
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/loop-5.c
@@ -0,0 +1,276 @@
+#include <omp.h>
+#include <stdlib.h>
+#include <string.h>
+
+int
+test1 (void)
+{
+  short int buf[64], *p;
+  int i;
+  memset (buf, '\0', sizeof (buf));
+#pragma omp parallel for
+  for (p = &buf[10]; p < &buf[54]; p++)
+    *p = 5;
+  for (i = 0; i < 64; i++)
+    if (buf[i] != 5 * (i >= 10 && i < 54))
+      abort ();
+  memset (buf, '\0', sizeof (buf));
+#pragma omp parallel for
+  for (p = &buf[3]; p <= &buf[63]; p += 2)
+    p[-2] = 6;
+  for (i = 0; i < 64; i++)
+    if (buf[i] != 6 * ((i & 1) && i <= 61))
+      abort ();
+ memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for + for (p = &buf[16]; p < &buf[51]; p = 4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for + for (p = &buf[16]; p <= &buf[40]; p = p + 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i & 3) == 2 && i >= 18 && i <= 42)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for + for (p = &buf[53]; p > &buf[9]; --p) + *p = 5; + for (i = 0; i < 64; i++) + if (buf[i] != 5 * (i >= 10 && i < 54)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for + for (p = &buf[63]; p >= &buf[3]; p -= 2) + p[-2] = 6; + for (i = 0; i < 64; i++) + if (buf[i] != 6 * ((i & 1) && i <= 61)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for + for (p = &buf[48]; p > &buf[15]; p = -4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for + for (p = &buf[40]; p >= &buf[16]; p = p - 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i & 3) == 2 && i >= 18 && i <= 42)) + abort (); + return 0; +} + +int +test2 (void) +{ + int buf[64], *p; + int i; + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[10]; p < &buf[54]; p++) + *p = 5; + for (i = 0; i < 64; i++) + if (buf[i] != 5 * (i >= 10 && i < 54)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[3]; p <= &buf[63]; p += 2) + p[-2] = 6; + for (i = 0; i < 64; i++) + if (buf[i] != 6 * ((i & 1) && i <= 61)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[16]; p < &buf[51]; p = 4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + 
abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[16]; p <= &buf[40]; p = p + 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i & 3) == 2 && i >= 18 && i <= 42)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[53]; p > &buf[9]; --p) + *p = 5; + for (i = 0; i < 64; i++) + if (buf[i] != 5 * (i >= 10 && i < 54)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[63]; p >= &buf[3]; p -= 2) + p[-2] = 6; + for (i = 0; i < 64; i++) + if (buf[i] != 6 * ((i & 1) && i <= 61)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[48]; p > &buf[15]; p = -4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (static, 3) + for (p = &buf[40]; p >= &buf[16]; p = p - 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i & 3) == 2 && i >= 18 && i <= 42)) + abort (); + return 0; +} + +int +test3 (void) +{ + int buf[64], *p; + int i; + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for (p = &buf[10]; p < &buf[54]; p++) + *p = 5; + for (i = 0; i < 64; i++) + if (buf[i] != 5 * (i >= 10 && i < 54)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for (p = &buf[3]; p <= &buf[63]; p += 2) + p[-2] = 6; + for (i = 0; i < 64; i++) + if (buf[i] != 6 * ((i & 1) && i <= 61)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for (p = &buf[16]; p < &buf[51]; p = 4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for 
(p = &buf[16]; p <= &buf[40]; p = p + 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i & 3) == 2 && i >= 18 && i <= 42)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for (p = &buf[53]; p > &buf[9]; --p) + *p = 5; + for (i = 0; i < 64; i++) + if (buf[i] != 5 * (i >= 10 && i < 54)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for (p = &buf[63]; p >= &buf[3]; p -= 2) + p[-2] = 6; + for (i = 0; i < 64; i++) + if (buf[i] != 6 * ((i & 1) && i <= 61)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for (p = &buf[48]; p > &buf[15]; p = -4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (dynamic, 3) + for (p = &buf[40]; p >= &buf[16]; p = p - 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i & 3) == 2 && i >= 18 && i <= 42)) + abort (); + return 0; +} + +int +test4 (void) +{ + int buf[64], *p; + int i; + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = &buf[10]; p < &buf[54]; p++) + *p = 5; + for (i = 0; i < 64; i++) + if (buf[i] != 5 * (i >= 10 && i < 54)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = &buf[3]; p <= &buf[63]; p += 2) + p[-2] = 6; + for (i = 0; i < 64; i++) + if (buf[i] != 6 * ((i & 1) && i <= 61)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = &buf[16]; p < &buf[51]; p = 4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = &buf[16]; p <= &buf[40]; p = p + 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i 
& 3) == 2 && i >= 18 && i <= 42)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = &buf[53]; p > &buf[9]; --p) + *p = 5; + for (i = 0; i < 64; i++) + if (buf[i] != 5 * (i >= 10 && i < 54)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = &buf[63]; p >= &buf[3]; p -= 2) + p[-2] = 6; + for (i = 0; i < 64; i++) + if (buf[i] != 6 * ((i & 1) && i <= 61)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = &buf[48]; p > &buf[15]; p = -4 + p) + p[2] = 7; + for (i = 0; i < 64; i++) + if (buf[i] != 7 * ((i & 3) == 2 && i >= 18 && i < 53)) + abort (); + memset (buf, '\0', sizeof (buf)); +#pragma omp parallel for schedule (runtime) + for (p = &buf[40]; p >= &buf[16]; p = p - 4ULL) + p[2] = -7; + for (i = 0; i < 64; i++) + if (buf[i] != -7 * ((i & 3) == 2 && i >= 18 && i <= 42)) + abort (); + return 0; +} + +int +main (void) +{ + test1 (); + test2 (); + test3 (); + omp_set_schedule (omp_sched_static, 0); + test4 (); + omp_set_schedule (omp_sched_static, 3); + test4 (); + omp_set_schedule (omp_sched_dynamic, 5); + test4 (); + omp_set_schedule (omp_sched_guided, 2); + test4 (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c/loop-6.c b/libgomp/testsuite/libgomp.c/loop-6.c new file mode 100644 index 00000000000..9029e181bd2 --- /dev/null +++ b/libgomp/testsuite/libgomp.c/loop-6.c @@ -0,0 +1,387 @@ +/* { dg-do run } */ + +#include <omp.h> + +extern void abort (void); + +#define LLONG_MAX __LONG_LONG_MAX__ +#define ULLONG_MAX (LLONG_MAX * 2ULL + 1) +#define INT_MAX __INT_MAX__ + +int arr[6 * 5]; + +void +set (int loopidx, int idx) +{ +#pragma omp atomic + arr[loopidx * 5 + idx]++; +} + +#define check(var, val, loopidx, idx) \ + if (var == (val)) set (loopidx, idx); else +#define test(loopidx, count) \ + for (idx = 0; idx < 5; idx++) \ + if (arr[loopidx * 5 + idx] != idx < count) \ + abort (); \ + else \ + 
arr[loopidx * 5 + idx] = 0 + +int +test1 (void) +{ + int e = 0, idx; + +#pragma omp parallel reduction(+:e) + { + long long i; + unsigned long long j; + #pragma omp for schedule(dynamic,1) nowait + for (i = LLONG_MAX - 30001; i <= LLONG_MAX - 10001; i += 10000) + { + check (i, LLONG_MAX - 30001, 0, 0) + check (i, LLONG_MAX - 20001, 0, 1) + check (i, LLONG_MAX - 10001, 0, 2) + e = 1; + } + #pragma omp for schedule(dynamic,1) nowait + for (i = -LLONG_MAX + 30000; i >= -LLONG_MAX + 10000; i -= 10000) + { + check (i, -LLONG_MAX + 30000, 1, 0) + check (i, -LLONG_MAX + 20000, 1, 1) + check (i, -LLONG_MAX + 10000, 1, 2) + e = 1; + } + #pragma omp for schedule(dynamic,1) nowait + for (j = 20; j <= LLONG_MAX - 70; j += LLONG_MAX + 50ULL) + { + check (j, 20, 2, 0) + e = 1; + } + #pragma omp for schedule(dynamic,1) nowait + for (j = ULLONG_MAX - 3; j >= LLONG_MAX + 70ULL; j -= LLONG_MAX + 50ULL) + { + check (j, ULLONG_MAX - 3, 3, 0) + e = 1; + } + #pragma omp for schedule(dynamic,1) nowait + for (j = LLONG_MAX - 20000ULL; j <= LLONG_MAX + 10000ULL; j += 10000ULL) + { + check (j, LLONG_MAX - 20000ULL, 4, 0) + check (j, LLONG_MAX - 10000ULL, 4, 1) + check (j, LLONG_MAX, 4, 2) + check (j, LLONG_MAX + 10000ULL, 4, 3) + e = 1; + } + #pragma omp for schedule(dynamic,1) nowait + for (i = -3LL * INT_MAX - 20000LL; i <= INT_MAX + 10000LL; i += INT_MAX + 200LL) + { + check (i, -3LL * INT_MAX - 20000LL, 5, 0) + check (i, -2LL * INT_MAX - 20000LL + 200LL, 5, 1) + check (i, -INT_MAX - 20000LL + 400LL, 5, 2) + check (i, -20000LL + 600LL, 5, 3) + check (i, INT_MAX - 20000LL + 800LL, 5, 4) + e = 1; + } + } + if (e) + abort (); + test (0, 3); + test (1, 3); + test (2, 1); + test (3, 1); + test (4, 4); + test (5, 5); + return 0; +} + +int +test2 (void) +{ + int e = 0, idx; + +#pragma omp parallel reduction(+:e) + { + long long i; + unsigned long long j; + #pragma omp for schedule(guided,1) nowait + for (i = LLONG_MAX - 30001; i <= LLONG_MAX - 10001; i += 10000) + { + check (i, LLONG_MAX - 
30001, 0, 0) + check (i, LLONG_MAX - 20001, 0, 1) + check (i, LLONG_MAX - 10001, 0, 2) + e = 1; + } + #pragma omp for schedule(guided,1) nowait + for (i = -LLONG_MAX + 30000; i >= -LLONG_MAX + 10000; i -= 10000) + { + check (i, -LLONG_MAX + 30000, 1, 0) + check (i, -LLONG_MAX + 20000, 1, 1) + check (i, -LLONG_MAX + 10000, 1, 2) + e = 1; + } + #pragma omp for schedule(guided,1) nowait + for (j = 20; j <= LLONG_MAX - 70; j += LLONG_MAX + 50ULL) + { + check (j, 20, 2, 0) + e = 1; + } + #pragma omp for schedule(guided,1) nowait + for (j = ULLONG_MAX - 3; j >= LLONG_MAX + 70ULL; j -= LLONG_MAX + 50ULL) + { + check (j, ULLONG_MAX - 3, 3, 0) + e = 1; + } + #pragma omp for schedule(guided,1) nowait + for (j = LLONG_MAX - 20000ULL; j <= LLONG_MAX + 10000ULL; j += 10000ULL) + { + check (j, LLONG_MAX - 20000ULL, 4, 0) + check (j, LLONG_MAX - 10000ULL, 4, 1) + check (j, LLONG_MAX, 4, 2) + check (j, LLONG_MAX + 10000ULL, 4, 3) + e = 1; + } + #pragma omp for schedule(guided,1) nowait + for (i = -3LL * INT_MAX - 20000LL; i <= INT_MAX + 10000LL; i += INT_MAX + 200LL) + { + check (i, -3LL * INT_MAX - 20000LL, 5, 0) + check (i, -2LL * INT_MAX - 20000LL + 200LL, 5, 1) + check (i, -INT_MAX - 20000LL + 400LL, 5, 2) + check (i, -20000LL + 600LL, 5, 3) + check (i, INT_MAX - 20000LL + 800LL, 5, 4) + e = 1; + } + } + if (e) + abort (); + test (0, 3); + test (1, 3); + test (2, 1); + test (3, 1); + test (4, 4); + test (5, 5); + return 0; +} + +int +test3 (void) +{ + int e = 0, idx; + +#pragma omp parallel reduction(+:e) + { + long long i; + unsigned long long j; + #pragma omp for schedule(static) nowait + for (i = LLONG_MAX - 30001; i <= LLONG_MAX - 10001; i += 10000) + { + check (i, LLONG_MAX - 30001, 0, 0) + check (i, LLONG_MAX - 20001, 0, 1) + check (i, LLONG_MAX - 10001, 0, 2) + e = 1; + } + #pragma omp for schedule(static) nowait + for (i = -LLONG_MAX + 30000; i >= -LLONG_MAX + 10000; i -= 10000) + { + check (i, -LLONG_MAX + 30000, 1, 0) + check (i, -LLONG_MAX + 20000, 1, 1) + check (i, 
-LLONG_MAX + 10000, 1, 2) + e = 1; + } + #pragma omp for schedule(static) nowait + for (j = 20; j <= LLONG_MAX - 70; j += LLONG_MAX + 50ULL) + { + check (j, 20, 2, 0) + e = 1; + } + #pragma omp for schedule(static) nowait + for (j = ULLONG_MAX - 3; j >= LLONG_MAX + 70ULL; j -= LLONG_MAX + 50ULL) + { + check (j, ULLONG_MAX - 3, 3, 0) + e = 1; + } + #pragma omp for schedule(static) nowait + for (j = LLONG_MAX - 20000ULL; j <= LLONG_MAX + 10000ULL; j += 10000ULL) + { + check (j, LLONG_MAX - 20000ULL, 4, 0) + check (j, LLONG_MAX - 10000ULL, 4, 1) + check (j, LLONG_MAX, 4, 2) + check (j, LLONG_MAX + 10000ULL, 4, 3) + e = 1; + } + #pragma omp for schedule(static) nowait + for (i = -3LL * INT_MAX - 20000LL; i <= INT_MAX + 10000LL; i += INT_MAX + 200LL) + { + check (i, -3LL * INT_MAX - 20000LL, 5, 0) + check (i, -2LL * INT_MAX - 20000LL + 200LL, 5, 1) + check (i, -INT_MAX - 20000LL + 400LL, 5, 2) + check (i, -20000LL + 600LL, 5, 3) + check (i, INT_MAX - 20000LL + 800LL, 5, 4) + e = 1; + } + } + if (e) + abort (); + test (0, 3); + test (1, 3); + test (2, 1); + test (3, 1); + test (4, 4); + test (5, 5); + return 0; +} + +int +test4 (void) +{ + int e = 0, idx; + +#pragma omp parallel reduction(+:e) + { + long long i; + unsigned long long j; + #pragma omp for schedule(static,1) nowait + for (i = LLONG_MAX - 30001; i <= LLONG_MAX - 10001; i += 10000) + { + check (i, LLONG_MAX - 30001, 0, 0) + check (i, LLONG_MAX - 20001, 0, 1) + check (i, LLONG_MAX - 10001, 0, 2) + e = 1; + } + #pragma omp for schedule(static,1) nowait + for (i = -LLONG_MAX + 30000; i >= -LLONG_MAX + 10000; i -= 10000) + { + check (i, -LLONG_MAX + 30000, 1, 0) + check (i, -LLONG_MAX + 20000, 1, 1) + check (i, -LLONG_MAX + 10000, 1, 2) + e = 1; + } + #pragma omp for schedule(static,1) nowait + for (j = 20; j <= LLONG_MAX - 70; j += LLONG_MAX + 50ULL) + { + check (j, 20, 2, 0) + e = 1; + } + #pragma omp for schedule(static,1) nowait + for (j = ULLONG_MAX - 3; j >= LLONG_MAX + 70ULL; j -= LLONG_MAX + 50ULL) + { + 
check (j, ULLONG_MAX - 3, 3, 0) + e = 1; + } + #pragma omp for schedule(static,1) nowait + for (j = LLONG_MAX - 20000ULL; j <= LLONG_MAX + 10000ULL; j += 10000ULL) + { + check (j, LLONG_MAX - 20000ULL, 4, 0) + check (j, LLONG_MAX - 10000ULL, 4, 1) + check (j, LLONG_MAX, 4, 2) + check (j, LLONG_MAX + 10000ULL, 4, 3) + e = 1; + } + #pragma omp for schedule(static,1) nowait + for (i = -3LL * INT_MAX - 20000LL; i <= INT_MAX + 10000LL; i += INT_MAX + 200LL) + { + check (i, -3LL * INT_MAX - 20000LL, 5, 0) + check (i, -2LL * INT_MAX - 20000LL + 200LL, 5, 1) + check (i, -INT_MAX - 20000LL + 400LL, 5, 2) + check (i, -20000LL + 600LL, 5, 3) + check (i, INT_MAX - 20000LL + 800LL, 5, 4) + e = 1; + } + } + if (e) + abort (); + test (0, 3); + test (1, 3); + test (2, 1); + test (3, 1); + test (4, 4); + test (5, 5); + return 0; +} + +int +test5 (void) +{ + int e = 0, idx; + +#pragma omp parallel reduction(+:e) + { + long long i; + unsigned long long j; + #pragma omp for schedule(runtime) nowait + for (i = LLONG_MAX - 30001; i <= LLONG_MAX - 10001; i += 10000) + { + check (i, LLONG_MAX - 30001, 0, 0) + check (i, LLONG_MAX - 20001, 0, 1) + check (i, LLONG_MAX - 10001, 0, 2) + e = 1; + } + #pragma omp for schedule(runtime) nowait + for (i = -LLONG_MAX + 30000; i >= -LLONG_MAX + 10000; i -= 10000) + { + check (i, -LLONG_MAX + 30000, 1, 0) + check (i, -LLONG_MAX + 20000, 1, 1) + check (i, -LLONG_MAX + 10000, 1, 2) + e = 1; + } + #pragma omp for schedule(runtime) nowait + for (j = 20; j <= LLONG_MAX - 70; j += LLONG_MAX + 50ULL) + { + check (j, 20, 2, 0) + e = 1; + } + #pragma omp for schedule(runtime) nowait + for (j = ULLONG_MAX - 3; j >= LLONG_MAX + 70ULL; j -= LLONG_MAX + 50ULL) + { + check (j, ULLONG_MAX - 3, 3, 0) + e = 1; + } + #pragma omp for schedule(runtime) nowait + for (j = LLONG_MAX - 20000ULL; j <= LLONG_MAX + 10000ULL; j += 10000ULL) + { + check (j, LLONG_MAX - 20000ULL, 4, 0) + check (j, LLONG_MAX - 10000ULL, 4, 1) + check (j, LLONG_MAX, 4, 2) + check (j, LLONG_MAX + 
10000ULL, 4, 3) + e = 1; + } + #pragma omp for schedule(runtime) nowait + for (i = -3LL * INT_MAX - 20000LL; i <= INT_MAX + 10000LL; i += INT_MAX + 200LL) + { + check (i, -3LL * INT_MAX - 20000LL, 5, 0) + check (i, -2LL * INT_MAX - 20000LL + 200LL, 5, 1) + check (i, -INT_MAX - 20000LL + 400LL, 5, 2) + check (i, -20000LL + 600LL, 5, 3) + check (i, INT_MAX - 20000LL + 800LL, 5, 4) + e = 1; + } + } + if (e) + abort (); + test (0, 3); + test (1, 3); + test (2, 1); + test (3, 1); + test (4, 4); + test (5, 5); + return 0; +} + +int +main (void) +{ + if (2 * sizeof (int) != sizeof (long long)) + return 0; + test1 (); + test2 (); + test3 (); + test4 (); + omp_set_schedule (omp_sched_static, 0); + test5 (); + omp_set_schedule (omp_sched_static, 3); + test5 (); + omp_set_schedule (omp_sched_dynamic, 5); + test5 (); + omp_set_schedule (omp_sched_guided, 2); + test5 (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c/loop-7.c b/libgomp/testsuite/libgomp.c/loop-7.c new file mode 100644 index 00000000000..fc97f4a2907 --- /dev/null +++ b/libgomp/testsuite/libgomp.c/loop-7.c @@ -0,0 +1,105 @@ +/* { dg-do run } */ + +#include <omp.h> + +extern void abort (void); + +#define LLONG_MAX __LONG_LONG_MAX__ +#define ULLONG_MAX (LLONG_MAX * 2ULL + 1) +#define INT_MAX __INT_MAX__ + +int v; + +int +test1 (void) +{ + int e = 0, cnt = 0; + long long i; + unsigned long long j; + char buf[6], *p; + + #pragma omp for schedule(dynamic,1) collapse(2) nowait + for (i = LLONG_MAX - 30001; i <= LLONG_MAX - 10001; i += 10000) + for (j = 20; j <= LLONG_MAX - 70; j += LLONG_MAX + 50ULL) + if ((i != LLONG_MAX - 30001 + && i != LLONG_MAX - 20001 + && i != LLONG_MAX - 10001) + || j != 20) + e = 1; + else + cnt++; + if (e || cnt != 3) + abort (); + else + cnt = 0; + + #pragma omp for schedule(guided,1) collapse(2) nowait + for (i = -LLONG_MAX + 30000; i >= -LLONG_MAX + 10000; i -= 10000) + for (j = ULLONG_MAX - 3; j >= LLONG_MAX + 70ULL; j -= LLONG_MAX + 50ULL) + if ((i != -LLONG_MAX + 30000 + && i 
!= -LLONG_MAX + 20000 + && i != -LLONG_MAX + 10000) + || j != ULLONG_MAX - 3) + e = 1; + else + cnt++; + if (e || cnt != 3) + abort (); + else + cnt = 0; + + #pragma omp for schedule(static,1) collapse(2) nowait + for (i = LLONG_MAX - 30001; i <= LLONG_MAX - 10001; i += 10000) + for (j = 20; j <= LLONG_MAX - 70 + v; j += LLONG_MAX + 50ULL) + if ((i != LLONG_MAX - 30001 + && i != LLONG_MAX - 20001 + && i != LLONG_MAX - 10001) + || j != 20) + e = 1; + else + cnt++; + if (e || cnt != 3) + abort (); + else + cnt = 0; + + #pragma omp for schedule(static) collapse(2) nowait + for (i = -LLONG_MAX + 30000 + v; i >= -LLONG_MAX + 10000; i -= 10000) + for (j = ULLONG_MAX - 3; j >= LLONG_MAX + 70ULL; j -= LLONG_MAX + 50ULL) + if ((i != -LLONG_MAX + 30000 + && i != -LLONG_MAX + 20000 + && i != -LLONG_MAX + 10000) + || j != ULLONG_MAX - 3) + e = 1; + else + cnt++; + if (e || cnt != 3) + abort (); + else + cnt = 0; + + #pragma omp for schedule(runtime) collapse(2) nowait + for (i = 10; i < 30; i++) + for (p = buf; p <= buf + 4; p += 2) + if (i < 10 || i >= 30 || (p != buf && p != buf + 2 && p != buf + 4)) + e = 1; + else + cnt++; + if (e || cnt != 60) + abort (); + else + cnt = 0; + + return 0; +} + +int +main (void) +{ + if (2 * sizeof (int) != sizeof (long long)) + return 0; + asm volatile ("" : "+r" (v)); + omp_set_schedule (omp_sched_dynamic, 1); + test1 (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c/loop-8.c b/libgomp/testsuite/libgomp.c/loop-8.c new file mode 100644 index 00000000000..25db25c3b43 --- /dev/null +++ b/libgomp/testsuite/libgomp.c/loop-8.c @@ -0,0 +1,27 @@ +extern void abort (void); + +int buf[256]; + +void __attribute__((noinline)) +foo (void) +{ + int i; + #pragma omp for schedule (auto) + for (i = 0; i < 256; i++) + buf[i] += i; +} + +int +main (void) +{ + int i; + #pragma omp parallel for schedule (auto) + for (i = 0; i < 256; i++) + buf[i] = i; + #pragma omp parallel num_threads (4) + foo (); + for (i = 0; i < 256; i++) + if (buf[i] != 2 * i) 
+ abort (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c/loop-9.c b/libgomp/testsuite/libgomp.c/loop-9.c new file mode 100644 index 00000000000..1f789e12ecb --- /dev/null +++ b/libgomp/testsuite/libgomp.c/loop-9.c @@ -0,0 +1,18 @@ +extern void abort (void); + +char buf[8] = "01234567"; +char buf2[8] = "23456789"; + +int +main (void) +{ + char *p, *q; + int sum = 0; + #pragma omp parallel for collapse (2) reduction (+:sum) lastprivate (p, q) + for (p = buf; p < &buf[8]; p++) + for (q = &buf2[0]; q <= buf2 + 7; q++) + sum += (*p - '0') + (*q - '0'); + if (p != &buf[8] || q != buf2 + 8 || sum != 576) + abort (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c/nested-3.c b/libgomp/testsuite/libgomp.c/nested-3.c new file mode 100644 index 00000000000..618600633ac --- /dev/null +++ b/libgomp/testsuite/libgomp.c/nested-3.c @@ -0,0 +1,89 @@ +#include <omp.h> +#include <stdlib.h> +#include <string.h> + +int +main (void) +{ + int e[3]; + + memset (e, '\0', sizeof (e)); + omp_set_nested (1); + omp_set_dynamic (0); + if (omp_in_parallel () + || omp_get_level () != 0 + || omp_get_ancestor_thread_num (0) != 0 + || omp_get_ancestor_thread_num (-1) != -1 + || omp_get_ancestor_thread_num (1) != -1 + || omp_get_team_size (0) != 1 + || omp_get_team_size (-1) != -1 + || omp_get_team_size (1) != -1 + || omp_get_active_level () != 0) + abort (); +#pragma omp parallel num_threads (4) + { + int tn1 = omp_get_thread_num (); + if (omp_in_parallel () != 1 + || omp_get_num_threads () != 4 + || tn1 >= 4 || tn1 < 0 + || omp_get_level () != 1 + || omp_get_ancestor_thread_num (0) != 0 + || omp_get_ancestor_thread_num (1) != tn1 + || omp_get_ancestor_thread_num (-1) != -1 + || omp_get_ancestor_thread_num (2) != -1 + || omp_get_team_size (0) != 1 + || omp_get_team_size (1) != omp_get_num_threads () + || omp_get_team_size (-1) != -1 + || omp_get_team_size (2) != -1 + || omp_get_active_level () != 1) + #pragma omp atomic + e[0] += 1; + #pragma omp parallel if (0) num_threads(5) 
firstprivate(tn1) + { + int tn2 = omp_get_thread_num (); + if (omp_in_parallel () != 1 + || omp_get_num_threads () != 1 + || tn2 != 0 + || omp_get_level () != 2 + || omp_get_ancestor_thread_num (0) != 0 + || omp_get_ancestor_thread_num (1) != tn1 + || omp_get_ancestor_thread_num (2) != tn2 + || omp_get_ancestor_thread_num (-1) != -1 + || omp_get_ancestor_thread_num (3) != -1 + || omp_get_team_size (0) != 1 + || omp_get_team_size (1) != 4 + || omp_get_team_size (2) != 1 + || omp_get_team_size (-1) != -1 + || omp_get_team_size (3) != -1 + || omp_get_active_level () != 1) + #pragma omp atomic + e[1] += 1; + #pragma omp parallel num_threads(2) firstprivate(tn1, tn2) + { + int tn3 = omp_get_thread_num (); + if (omp_in_parallel () != 1 + || omp_get_num_threads () != 2 + || tn3 > 1 || tn3 < 0 + || omp_get_level () != 3 + || omp_get_ancestor_thread_num (0) != 0 + || omp_get_ancestor_thread_num (1) != tn1 + || omp_get_ancestor_thread_num (2) != tn2 + || omp_get_ancestor_thread_num (3) != tn3 + || omp_get_ancestor_thread_num (-1) != -1 + || omp_get_ancestor_thread_num (4) != -1 + || omp_get_team_size (0) != 1 + || omp_get_team_size (1) != 4 + || omp_get_team_size (2) != 1 + || omp_get_team_size (3) != 2 + || omp_get_team_size (-1) != -1 + || omp_get_team_size (4) != -1 + || omp_get_active_level () != 2) + #pragma omp atomic + e[2] += 1; + } + } + } + if (e[0] || e[1] || e[2]) + abort (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c/nestedfn-6.c b/libgomp/testsuite/libgomp.c/nestedfn-6.c new file mode 100644 index 00000000000..c0ace6b3fb8 --- /dev/null +++ b/libgomp/testsuite/libgomp.c/nestedfn-6.c @@ -0,0 +1,21 @@ +extern void abort (void); + +int j; + +int +main (void) +{ + int i; + void nested (void) { i = 0; } +#pragma omp parallel for lastprivate (i) + for (i = 0; i < 50; i += 3) + ; + if (i != 51) + abort (); +#pragma omp parallel for lastprivate (j) + for (j = -50; j < 70; j += 7) + ; + if (j != 76) + abort (); + return 0; +} diff --git 
a/libgomp/testsuite/libgomp.c/pr26943-2.c b/libgomp/testsuite/libgomp.c/pr26943-2.c index 778048492f6..c052e811288 100644 --- a/libgomp/testsuite/libgomp.c/pr26943-2.c +++ b/libgomp/testsuite/libgomp.c/pr26943-2.c @@ -20,7 +20,7 @@ main (void) { if (a != 8 || b != 12 || e[0] != 'a' || f[0] != 'b') j++; -#pragma omp barrier +#pragma omp barrier /* { dg-warning "may not be closely nested" } */ #pragma omp atomic a += i; b += i; @@ -31,7 +31,7 @@ main (void) f[0] += i; g[0] = 'g' + i; h[0] = 'h' + i; -#pragma omp barrier +#pragma omp barrier /* { dg-warning "may not be closely nested" } */ if (a != 8 + 6 || b != 12 + i || c != i || d != i) j += 8; if (e[0] != 'a' + 6 || f[0] != 'b' + i || g[0] != 'g' + i) diff --git a/libgomp/testsuite/libgomp.c/pr26943-3.c b/libgomp/testsuite/libgomp.c/pr26943-3.c index be93cb479d1..dc3d5010da1 100644 --- a/libgomp/testsuite/libgomp.c/pr26943-3.c +++ b/libgomp/testsuite/libgomp.c/pr26943-3.c @@ -26,7 +26,7 @@ main (void) { if (a != 8 || b != 12 || e[0] != 'a' || f[0] != 'b') j++; -#pragma omp barrier +#pragma omp barrier /* { dg-warning "may not be closely nested" } */ #pragma omp atomic a += i; b += i; @@ -37,7 +37,7 @@ main (void) f[0] += i; g[0] = 'g' + i; h[0] = 'h' + i; -#pragma omp barrier +#pragma omp barrier /* { dg-warning "may not be closely nested" } */ if (a != 8 + 6 || b != 12 + i || c != i || d != i) j += 8; if (e[0] != 'a' + 6 || f[0] != 'b' + i || g[0] != 'g' + i) diff --git a/libgomp/testsuite/libgomp.c/pr26943-4.c b/libgomp/testsuite/libgomp.c/pr26943-4.c index 33d368583dd..0f1d4197a5f 100644 --- a/libgomp/testsuite/libgomp.c/pr26943-4.c +++ b/libgomp/testsuite/libgomp.c/pr26943-4.c @@ -27,7 +27,7 @@ main (void) { if (a != 8 || b != 12 || e[0] != 'a' || f[0] != 'b') j++; -#pragma omp barrier +#pragma omp barrier /* { dg-warning "may not be closely nested" } */ #pragma omp atomic a += i; b += i; @@ -38,7 +38,7 @@ main (void) f[0] += i; g[0] = 'g' + i; h[0] = 'h' + i; -#pragma omp barrier +#pragma omp barrier /* { 
dg-warning "may not be closely nested" } */ if (a != 8 + 6 || b != 12 + i || c != i || d != i) j += 8; if (e[0] != 'a' + 6 || f[0] != 'b' + i || g[0] != 'g' + i) diff --git a/libgomp/testsuite/libgomp.c/sort-1.c b/libgomp/testsuite/libgomp.c/sort-1.c new file mode 100644 index 00000000000..269d69da12c --- /dev/null +++ b/libgomp/testsuite/libgomp.c/sort-1.c @@ -0,0 +1,379 @@ +/* Test and benchmark of a couple of parallel sorting algorithms. + Copyright (C) 2008 Free Software Foundation, Inc. + + GCC is free software; you can redistribute it and/or modify it under + the terms of the GNU General Public License as published by the Free + Software Foundation; either version 3, or (at your option) any later + version. + + GCC is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or + FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + <http://www.gnu.org/licenses/>. 
*/ + +#include <limits.h> +#include <omp.h> +#include <stdbool.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +int failures; + +#define THRESHOLD 100 + +static void +verify (const char *name, double stime, int *array, int count) +{ + int i; + double etime = omp_get_wtime (); + + printf ("%s: %g\n", name, etime - stime); + for (i = 1; i < count; i++) + if (array[i] < array[i - 1]) + { + printf ("%s: incorrectly sorted\n", name); + failures = 1; + } +} + +static void +insertsort (int *array, int s, int e) +{ + int i, j, val; + for (i = s + 1; i <= e; i++) + { + val = array[i]; + j = i; + while (j-- > s && val < array[j]) + array[j + 1] = array[j]; + array[j + 1] = val; + } +} + +struct int_pair +{ + int lo; + int hi; +}; + +struct int_pair_stack +{ + struct int_pair *top; +#define STACK_SIZE 4 * CHAR_BIT * sizeof (int) + struct int_pair arr[STACK_SIZE]; +}; + +static inline void +init_int_pair_stack (struct int_pair_stack *stack) +{ + stack->top = &stack->arr[0]; +} + +static inline void +push_int_pair_stack (struct int_pair_stack *stack, int lo, int hi) +{ + stack->top->lo = lo; + stack->top->hi = hi; + stack->top++; +} + +static inline void +pop_int_pair_stack (struct int_pair_stack *stack, int *lo, int *hi) +{ + stack->top--; + *lo = stack->top->lo; + *hi = stack->top->hi; +} + +static inline int +size_int_pair_stack (struct int_pair_stack *stack) +{ + return stack->top - &stack->arr[0]; +} + +static inline void +busy_wait (void) +{ +#if defined __i386__ || defined __x86_64__ + __asm volatile ("rep; nop" : : : "memory"); +#elif defined __ia64__ + __asm volatile ("hint @pause" : : : "memory"); +#elif defined __sparc__ && (defined __arch64__ || defined __sparc_v9__) + __asm volatile ("membar #LoadLoad" : : : "memory"); +#else + __asm volatile ("" : : : "memory"); +#endif +} + +static inline void +swap (int *array, int a, int b) +{ + int val = array[a]; + array[a] = array[b]; + array[b] = val; +} + +static inline int +choose_pivot (int *array, int 
lo, int hi) +{ + int mid = (lo + hi) / 2; + + if (array[mid] < array[lo]) + swap (array, lo, mid); + if (array[hi] < array[mid]) + { + swap (array, mid, hi); + if (array[mid] < array[lo]) + swap (array, lo, mid); + } + return array[mid]; +} + +static inline int +partition (int *array, int lo, int hi) +{ + int pivot = choose_pivot (array, lo, hi); + int left = lo; + int right = hi; + + for (;;) + { + while (array[++left] < pivot); + while (array[--right] > pivot); + if (left >= right) + break; + swap (array, left, right); + } + return left; +} + +static void +sort1 (int *array, int count) +{ + omp_lock_t lock; + struct int_pair_stack global_stack; + int busy = 1; + int num_threads; + + omp_init_lock (&lock); + init_int_pair_stack (&global_stack); + #pragma omp parallel firstprivate (array, count) + { + int lo = 0, hi = 0, mid, next_lo, next_hi; + bool idle = true; + struct int_pair_stack local_stack; + + init_int_pair_stack (&local_stack); + if (omp_get_thread_num () == 0) + { + num_threads = omp_get_num_threads (); + hi = count - 1; + idle = false; + } + + for (;;) + { + if (hi - lo < THRESHOLD) + { + insertsort (array, lo, hi); + lo = hi; + } + if (lo >= hi) + { + if (size_int_pair_stack (&local_stack) == 0) + { + again: + omp_set_lock (&lock); + if (size_int_pair_stack (&global_stack) == 0) + { + if (!idle) + busy--; + if (busy == 0) + { + omp_unset_lock (&lock); + break; + } + omp_unset_lock (&lock); + idle = true; + while (size_int_pair_stack (&global_stack) == 0 + && busy) + busy_wait (); + goto again; + } + if (idle) + busy++; + pop_int_pair_stack (&global_stack, &lo, &hi); + omp_unset_lock (&lock); + idle = false; + } + else + pop_int_pair_stack (&local_stack, &lo, &hi); + } + + mid = partition (array, lo, hi); + if (mid - lo < hi - mid) + { + next_lo = mid; + next_hi = hi; + hi = mid - 1; + } + else + { + next_lo = lo; + next_hi = mid - 1; + lo = mid; + } + + if (next_hi - next_lo < THRESHOLD) + insertsort (array, next_lo, next_hi); + else + { + if 
(size_int_pair_stack (&global_stack) < num_threads - 1) + { + int size; + + omp_set_lock (&lock); + size = size_int_pair_stack (&global_stack); + if (size < num_threads - 1 && size < STACK_SIZE) + push_int_pair_stack (&global_stack, next_lo, next_hi); + else + push_int_pair_stack (&local_stack, next_lo, next_hi); + omp_unset_lock (&lock); + } + else + push_int_pair_stack (&local_stack, next_lo, next_hi); + } + } + } + omp_destroy_lock (&lock); +} + +static void +sort2_1 (int *array, int lo, int hi, int num_threads, int *busy) +{ + int mid; + + if (hi - lo < THRESHOLD) + { + insertsort (array, lo, hi); + return; + } + + mid = partition (array, lo, hi); + + if (*busy >= num_threads) + { + sort2_1 (array, lo, mid - 1, num_threads, busy); + sort2_1 (array, mid, hi, num_threads, busy); + return; + } + + #pragma omp atomic + *busy += 1; + + #pragma omp parallel num_threads (2) \ + firstprivate (array, lo, hi, mid, num_threads, busy) + { + if (omp_get_thread_num () == 0) + sort2_1 (array, lo, mid - 1, num_threads, busy); + else + { + sort2_1 (array, mid, hi, num_threads, busy); + #pragma omp atomic + *busy -= 1; + } + } +} + +static void +sort2 (int *array, int count) +{ + int num_threads; + int busy = 1; + + #pragma omp parallel + #pragma omp single nowait + num_threads = omp_get_num_threads (); + + sort2_1 (array, 0, count - 1, num_threads, &busy); +} + +#if _OPENMP >= 200805 +static void +sort3_1 (int *array, int lo, int hi) +{ + int mid; + + if (hi - lo < THRESHOLD) + { + insertsort (array, lo, hi); + return; + } + + mid = partition (array, lo, hi); + #pragma omp task + sort3_1 (array, lo, mid - 1); + sort3_1 (array, mid, hi); +} + +static void +sort3 (int *array, int count) +{ + #pragma omp parallel + #pragma omp single + sort3_1 (array, 0, count - 1); +} +#endif + +int +main (int argc, char **argv) +{ + int i, count = 1000000; + double stime; + int *unsorted, *sorted, num_threads; + if (argc >= 2) + count = strtoul (argv[1], NULL, 0); + + unsorted = malloc (count * 
sizeof (int)); + sorted = malloc (count * sizeof (int)); + if (unsorted == NULL || sorted == NULL) + { + puts ("allocation failure"); + exit (1); + } + + srand (0xdeadbeef); + for (i = 0; i < count; i++) + unsorted[i] = rand (); + + omp_set_nested (1); + omp_set_dynamic (0); + #pragma omp parallel + #pragma omp single nowait + num_threads = omp_get_num_threads (); + printf ("Threads: %d\n", num_threads); + + memcpy (sorted, unsorted, count * sizeof (int)); + stime = omp_get_wtime (); + sort1 (sorted, count); + verify ("sort1", stime, sorted, count); + + memcpy (sorted, unsorted, count * sizeof (int)); + stime = omp_get_wtime (); + sort2 (sorted, count); + verify ("sort2", stime, sorted, count); + +#if _OPENMP >= 200805 + memcpy (sorted, unsorted, count * sizeof (int)); + stime = omp_get_wtime (); + sort3 (sorted, count); + verify ("sort3", stime, sorted, count); +#endif + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c/task-1.c b/libgomp/testsuite/libgomp.c/task-1.c new file mode 100644 index 00000000000..66f58a29b87 --- /dev/null +++ b/libgomp/testsuite/libgomp.c/task-1.c @@ -0,0 +1,84 @@ +extern void abort (void); + +int a = 18; + +void +f1 (int i, int j, int k) +{ + int l = 6, m = 7, n = 8; +#pragma omp task private(j, m) shared(k, n) + { + j = 6; + m = 5; + if (++a != 19 || ++i != 9 || j != 6 || ++l != 7 || m != 5 || ++n != 9) + #pragma omp atomic + k++; + } +#pragma omp taskwait + if (a != 19 || i != 8 || j != 26 || k != 0 || l != 6 || m != 7 || n != 9) + abort (); +} + +int v1 = 1, v2 = 2, v5 = 5; +int err; + +void +f2 (void) +{ + int v3 = 3; +#pragma omp sections private (v1) firstprivate (v2) + { + #pragma omp section + { + int v4 = 4; + v1 = 7; + #pragma omp task + { + if (++v1 != 8 || ++v2 != 3 || ++v3 != 4 || ++v4 != 5 || ++v5 != 6) + err = 1; + } + #pragma omp taskwait + if (v1 != 7 || v2 != 2 || v3 != 3 || v4 != 4 || v5 != 6) + abort (); + if (err) + abort (); + } + } +} + +void +f3 (int i, int j, int k) +{ + int l = 6, m = 7, n = 8; +#pragma 
omp task private(j, m) shared(k, n) untied + { + j = 6; + m = 5; + if (++a != 19 || ++i != 9 || j != 6 || ++l != 7 || m != 5 || ++n != 9) + #pragma omp atomic + k++; + } +#pragma omp taskwait + if (a != 19 || i != 8 || j != 26 || k != 0 || l != 6 || m != 7 || n != 9) + abort (); +} + +int +main (void) +{ + f1 (8, 26, 0); + f2 (); + a = 18; + f3 (8, 26, 0); + a = 18; +#pragma omp parallel num_threads(4) + { + #pragma omp master + { + f1 (8, 26, 0); + a = 18; + f3 (8, 26, 0); + } + } + return 0; +} diff --git a/libgomp/testsuite/libgomp.c/task-2.c b/libgomp/testsuite/libgomp.c/task-2.c new file mode 100644 index 00000000000..ed6a09c3557 --- /dev/null +++ b/libgomp/testsuite/libgomp.c/task-2.c @@ -0,0 +1,53 @@ +extern void abort (void); + +int +f1 (void) +{ + int a = 6, e = 0; + int nested (int x) + { + return x + a; + } + #pragma omp task + { + int n = nested (5); + if (n != 11) + #pragma omp atomic + e += 1; + } + #pragma omp taskwait + return e; +} + +int +f2 (void) +{ + int a = 6, e = 0; + int nested (int x) + { + return x + a; + } + a = nested (4); + #pragma omp task + { + if (a != 10) + #pragma omp atomic + e += 1; + } + #pragma omp taskwait + return e; +} + +int +main (void) +{ + int e = 0; + #pragma omp parallel num_threads(4) reduction(+:e) + { + e += f1 (); + e += f2 (); + } + if (e) + abort (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c/task-3.c b/libgomp/testsuite/libgomp.c/task-3.c new file mode 100644 index 00000000000..5657346bd15 --- /dev/null +++ b/libgomp/testsuite/libgomp.c/task-3.c @@ -0,0 +1,70 @@ +/* { dg-do run } */ + +#include <omp.h> +extern void abort (); + +int l = 5; + +int +foo (int i) +{ + int j = 7; + const int k = 8; + #pragma omp task firstprivate (i) shared (j, l) + { + #pragma omp critical + { + j += i; + l += k; + } + } + i++; + #pragma omp task firstprivate (i) shared (j, l) + { + #pragma omp critical + { + j += i; + l += k; + } + } + i++; + #pragma omp task firstprivate (i) shared (j, l) + { + #pragma omp critical + { 
+ j += i; + l += k; + } + } + i++; + #pragma omp task firstprivate (i) shared (j, l) + { + #pragma omp critical + { + j += i; + l += k; + } + } + i++; + #pragma omp taskwait + return (i != 8 * omp_get_thread_num () + 4 + || j != 4 * i - 3 + || k != 8); +} + +int +main (void) +{ + int r = 0; + #pragma omp parallel num_threads (4) reduction(+:r) + if (omp_get_num_threads () != 4) + { + #pragma omp master + l = 133; + } + else if (foo (8 * omp_get_thread_num ())) + r++; + if (r || l != 133) + abort (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c/task-4.c b/libgomp/testsuite/libgomp.c/task-4.c new file mode 100644 index 00000000000..18435930019 --- /dev/null +++ b/libgomp/testsuite/libgomp.c/task-4.c @@ -0,0 +1,40 @@ +/* { dg-do run } */ + +#include <omp.h> +#include <stdlib.h> +#include <string.h> + +int e; + +void __attribute__((noinline)) +baz (int i, int *p, int j, int *q) +{ + if (p[0] != 1 || p[i] != 3 || q[0] != 2 || q[j] != 4) + #pragma omp atomic + e++; +} + +void __attribute__((noinline)) +foo (int i, int j) +{ + int p[i + 1]; + int q[j + 1]; + memset (p, 0, sizeof (p)); + memset (q, 0, sizeof (q)); + p[0] = 1; + p[i] = 3; + q[0] = 2; + q[j] = 4; + #pragma omp task firstprivate (p, q) + baz (i, p, j, q); +} + +int +main (void) +{ + #pragma omp parallel num_threads (4) + foo (5 + omp_get_thread_num (), 7 + omp_get_thread_num ()); + if (e) + abort (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.fortran/allocatable1.f90 b/libgomp/testsuite/libgomp.fortran/allocatable1.f90 new file mode 100644 index 00000000000..1efe2abe959 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/allocatable1.f90 @@ -0,0 +1,81 @@ +! { dg-do run } +!$ use omp_lib + + integer, allocatable :: a(:, :) + integer :: b(6, 3) + integer :: i, j + logical :: k, l + b(:, :) = 16 + l = .false. 
+ if (allocated (a)) call abort +!$omp parallel private (a, b) reduction (.or.:l) + l = l.or.allocated (a) + allocate (a(3, 6)) + l = l.or..not.allocated (a) + l = l.or.size(a).ne.18.or.size(a,1).ne.3.or.size(a,2).ne.6 + a(3, 2) = 1 + b(3, 2) = 1 + deallocate (a) + l = l.or.allocated (a) +!$omp end parallel + if (allocated (a).or.l) call abort + allocate (a(6, 3)) + a(:, :) = 3 + if (.not.allocated (a)) call abort + l = l.or.size(a).ne.18.or.size(a,1).ne.6.or.size(a,2).ne.3 + if (l) call abort +!$omp parallel private (a, b) reduction (.or.:l) + l = l.or..not.allocated (a) + a(3, 2) = 1 + b(3, 2) = 1 +!$omp end parallel + if (l.or..not.allocated (a)) call abort +!$omp parallel firstprivate (a, b) reduction (.or.:l) + l = l.or..not.allocated (a) + l = l.or.size(a).ne.18.or.size(a,1).ne.6.or.size(a,2).ne.3 + do i = 1, 6 + l = l.or.(a(i, 1).ne.3).or.(a(i, 2).ne.3) + l = l.or.(a(i, 3).ne.3).or.(b(i, 1).ne.16) + l = l.or.(b(i, 2).ne.16).or.(b(i, 3).ne.16) + end do + a(:, :) = omp_get_thread_num () + b(:, :) = omp_get_thread_num () +!$omp end parallel + if (any (a.ne.3).or.any (b.ne.16).or.l) call abort + k = .true. +!$omp parallel do firstprivate (a, b, k) lastprivate (a, b) & +!$omp & reduction (.or.:l) + do i = 1, 36 + l = l.or..not.allocated (a) + l = l.or.size(a).ne.18.or.size(a,1).ne.6.or.size(a,2).ne.3 + if (k) then + do j = 1, 6 + l = l.or.(a(j, 1).ne.3).or.(a(j, 2).ne.3) + l = l.or.(a(j, 3).ne.3).or.(b(j, 1).ne.16) + l = l.or.(b(j, 2).ne.16).or.(b(j, 3).ne.16) + end do + k = .false. 
+ end if + a(:, :) = i + 2 + b(:, :) = i + end do + if (any (a.ne.38).or.any (b.ne.36).or.l) call abort + deallocate (a) + if (allocated (a)) call abort + allocate (a (0:1, 0:3)) + a(:, :) = 0 +!$omp parallel do reduction (+:a) reduction (.or.:l) & +!$omp & num_threads(3) schedule(static) + do i = 0, 7 + l = l.or..not.allocated (a) + l = l.or.size(a).ne.8.or.size(a,1).ne.2.or.size(a,2).ne.4 + a(modulo (i, 2), i / 2) = a(modulo (i, 2), i / 2) + i + a(i / 4, modulo (i, 4)) = a(i / 4, modulo (i, 4)) + i + end do + if (l) call abort + do i = 0, 1 + do j = 0, 3 + if (a(i, j) .ne. (5*i + 3*j)) call abort + end do + end do +end diff --git a/libgomp/testsuite/libgomp.fortran/allocatable2.f90 b/libgomp/testsuite/libgomp.fortran/allocatable2.f90 new file mode 100644 index 00000000000..a37616b04b1 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/allocatable2.f90 @@ -0,0 +1,47 @@ +! { dg-do run } +! { dg-require-effective-target tls_runtime } +!$ use omp_lib + + integer, save, allocatable :: a(:, :) + integer, allocatable :: b(:, :) + integer :: n + logical :: l +!$omp threadprivate (a) + if (allocated (a)) call abort + call omp_set_dynamic (.false.) + l = .false. 
+!$omp parallel num_threads (4) reduction(.or.:l) + allocate (a(-1:1, 7:10)) + a(:, :) = omp_get_thread_num () + 6 + l = l.or..not.allocated (a) + l = l.or.size(a).ne.12.or.size(a,1).ne.3.or.size(a,2).ne.4 +!$omp end parallel + if (l.or.any(a.ne.6)) call abort () +!$omp parallel num_threads (4) copyin (a) reduction(.or.:l) private (b) + l = l.or.allocated (b) + l = l.or..not.allocated (a) + l = l.or.size(a).ne.12.or.size(a,1).ne.3.or.size(a,2).ne.4 + l = l.or.any(a.ne.6) + allocate (b(1, 3)) + a(:, :) = omp_get_thread_num () + 36 + b(:, :) = omp_get_thread_num () + 66 + !$omp single + n = omp_get_thread_num () + !$omp end single copyprivate (a, b) + l = l.or..not.allocated (a) + l = l.or.size(a).ne.12.or.size(a,1).ne.3.or.size(a,2).ne.4 + l = l.or.any(a.ne.(n + 36)) + l = l.or..not.allocated (b) + l = l.or.size(b).ne.3.or.size(b,1).ne.1.or.size(b,2).ne.3 + l = l.or.any(b.ne.(n + 66)) + deallocate (b) + l = l.or.allocated (b) +!$omp end parallel + if (n.lt.0 .or. n.ge.4) call abort + if (l.or.any(a.ne.(n + 36))) call abort +!$omp parallel num_threads (4) reduction(.or.:l) + deallocate (a) + l = l.or.allocated (a) +!$omp end parallel + if (l.or.allocated (a)) call abort +end diff --git a/libgomp/testsuite/libgomp.fortran/allocatable3.f90 b/libgomp/testsuite/libgomp.fortran/allocatable3.f90 new file mode 100644 index 00000000000..fe3714a2b1f --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/allocatable3.f90 @@ -0,0 +1,21 @@ +! { dg-do run } + + integer, allocatable :: a(:) + integer :: i + logical :: l + l = .false. 
+ if (allocated (a)) call abort +!$omp parallel private (a) reduction (.or.:l) + allocate (a (-7:-5)) + l = l.or..not.allocated (a) + l = l.or.size(a).ne.3.or.size(a,1).ne.3 + a(:) = 0 + !$omp do private (a) + do i = 1, 7 + a(:) = i + l = l.or.any (a.ne.i) + end do + l = l.or.any (a.ne.0) + deallocate (a) +!$omp end parallel +end diff --git a/libgomp/testsuite/libgomp.fortran/allocatable4.f90 b/libgomp/testsuite/libgomp.fortran/allocatable4.f90 new file mode 100644 index 00000000000..996578c94fa --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/allocatable4.f90 @@ -0,0 +1,47 @@ +! { dg-do run } + + integer, allocatable :: a(:, :) + integer :: b(6, 3) + integer :: i, j + logical :: k, l + b(:, :) = 16 + l = .false. + if (allocated (a)) call abort +!$omp task private (a, b) shared (l) + l = l.or.allocated (a) + allocate (a(3, 6)) + l = l.or..not.allocated (a) + l = l.or.size(a).ne.18.or.size(a,1).ne.3.or.size(a,2).ne.6 + a(3, 2) = 1 + b(3, 2) = 1 + deallocate (a) + l = l.or.allocated (a) +!$omp end task +!$omp taskwait + if (allocated (a).or.l) call abort + allocate (a(6, 3)) + a(:, :) = 3 + if (.not.allocated (a)) call abort + l = l.or.size(a).ne.18.or.size(a,1).ne.6.or.size(a,2).ne.3 + if (l) call abort +!$omp task private (a, b) shared (l) + l = l.or..not.allocated (a) + a(3, 2) = 1 + b(3, 2) = 1 +!$omp end task +!$omp taskwait + if (l.or..not.allocated (a)) call abort +!$omp task firstprivate (a, b) shared (l) + l = l.or..not.allocated (a) + l = l.or.size(a).ne.18.or.size(a,1).ne.6.or.size(a,2).ne.3 + do i = 1, 6 + l = l.or.(a(i, 1).ne.3).or.(a(i, 2).ne.3) + l = l.or.(a(i, 3).ne.3).or.(b(i, 1).ne.16) + l = l.or.(b(i, 2).ne.16).or.(b(i, 3).ne.16) + end do + a(:, :) = 7 + b(:, :) = 8 +!$omp end task +!$omp taskwait + if (any (a.ne.3).or.any (b.ne.16).or.l) call abort +end diff --git a/libgomp/testsuite/libgomp.fortran/collapse1.f90 b/libgomp/testsuite/libgomp.fortran/collapse1.f90 new file mode 100644 index 00000000000..1ecfa0c9365 --- /dev/null +++ 
b/libgomp/testsuite/libgomp.fortran/collapse1.f90 @@ -0,0 +1,26 @@ +! { dg-do run } + +program collapse1 + integer :: i, j, k, a(1:3, 4:6, 5:7) + logical :: l + l = .false. + a(:, :, :) = 0 + !$omp parallel do collapse(4 - 1) schedule(static, 4) + do i = 1, 3 + do j = 4, 6 + do k = 5, 7 + a(i, j, k) = i + j + k + end do + end do + end do + !$omp parallel do collapse(2) reduction(.or.:l) + do i = 1, 3 + do j = 4, 6 + do k = 5, 7 + if (a(i, j, k) .ne. (i + j + k)) l = .true. + end do + end do + end do + !$omp end parallel do + if (l) call abort +end program collapse1 diff --git a/libgomp/testsuite/libgomp.fortran/collapse2.f90 b/libgomp/testsuite/libgomp.fortran/collapse2.f90 new file mode 100644 index 00000000000..77e0dee8260 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/collapse2.f90 @@ -0,0 +1,53 @@ +! { dg-do run } + +program collapse2 + call test1 + call test2 +contains + subroutine test1 + integer :: i, j, k, a(1:3, 4:6, 5:7) + logical :: l + l = .false. + a(:, :, :) = 0 + !$omp parallel do collapse(4 - 1) schedule(static, 4) + do 164 i = 1, 3 + do 164 j = 4, 6 + do 164 k = 5, 7 + a(i, j, k) = i + j + k +164 end do + !$omp parallel do collapse(2) reduction(.or.:l) +firstdo: do i = 1, 3 + do j = 4, 6 + do k = 5, 7 + if (a(i, j, k) .ne. (i + j + k)) l = .true. 
+ end do + end do + end do firstdo + !$omp end parallel do + if (l) call abort + end subroutine test1 + + subroutine test2 + integer :: a(3,3,3), k, kk, kkk, l, ll, lll + !$omp do collapse(3) + do 115 k=1,3 + dokk: do kk=1,3 + do kkk=1,3 + a(k,kk,kkk) = 1 + enddo + enddo dokk +115 continue + if (any(a(1:3,1:3,1:3).ne.1)) call abort + + !$omp do collapse(3) + dol: do 120 l=1,3 + doll: do ll=1,3 + do lll=1,3 + a(l,ll,lll) = 2 + enddo + enddo doll +120 end do dol + if (any(a(1:3,1:3,1:3).ne.2)) call abort + end subroutine test2 + +end program collapse2 diff --git a/libgomp/testsuite/libgomp.fortran/collapse3.f90 b/libgomp/testsuite/libgomp.fortran/collapse3.f90 new file mode 100644 index 00000000000..eac9eac651b --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/collapse3.f90 @@ -0,0 +1,204 @@ +! { dg-do run } + +program collapse3 + call test1 + call test2 (2, 6, -2, 4, 13, 18) + call test3 (2, 6, -2, 4, 13, 18, 1, 1, 1) + call test4 + call test5 (2, 6, -2, 4, 13, 18) + call test6 (2, 6, -2, 4, 13, 18, 1, 1, 1) +contains + subroutine test1 + integer :: i, j, k, a(1:7, -3:5, 12:19), m + logical :: l + l = .false. + a(:, :, :) = 0 + !$omp parallel do collapse (3) lastprivate (i, j, k, m) reduction (.or.:l) + do i = 2, 6 + do j = -2, 4 + do k = 13, 18 + l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + l = l.or.k.lt.13.or.k.gt.18 + if (.not.l) a(i, j, k) = a(i, j, k) + 1 + m = i * 100 + j * 10 + k + end do + end do + end do + if (i.ne.7.or.j.ne.5.or.k.ne.19) call abort + if (m.ne.(600+40+18)) call abort + do i = 1, 7 + do j = -3, 5 + do k = 12, 19 + if (i.eq.1.or.i.eq.7.or.j.eq.-3.or.j.eq.5.or.k.eq.12.or.k.eq.19) then + if (a(i, j, k).ne.0) print *, i, j, k + else + if (a(i, j, k).ne.1) print *, 'kk', i, j, k, a(i, j, k) + end if + end do + end do + end do + end subroutine test1 + + subroutine test2(v1, v2, v3, v4, v5, v6) + integer :: i, j, k, a(1:7, -3:5, 12:19), m + integer :: v1, v2, v3, v4, v5, v6 + logical :: l + l = .false. 
+ a(:, :, :) = 0 + !$omp parallel do collapse (3) lastprivate (i, j, k, m) reduction (.or.:l) + do i = v1, v2 + do j = v3, v4 + do k = v5, v6 + l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + l = l.or.k.lt.13.or.k.gt.18 + if (.not.l) a(i, j, k) = a(i, j, k) + 1 + m = i * 100 + j * 10 + k + end do + end do + end do + if (i.ne.7.or.j.ne.5.or.k.ne.19) call abort + if (m.ne.(600+40+18)) call abort + do i = 1, 7 + do j = -3, 5 + do k = 12, 19 + if (i.eq.1.or.i.eq.7.or.j.eq.-3.or.j.eq.5.or.k.eq.12.or.k.eq.19) then + if (a(i, j, k).ne.0) print *, i, j, k + else + if (a(i, j, k).ne.1) print *, 'kk', i, j, k, a(i, j, k) + end if + end do + end do + end do + end subroutine test2 + + subroutine test3(v1, v2, v3, v4, v5, v6, v7, v8, v9) + integer :: i, j, k, a(1:7, -3:5, 12:19), m + integer :: v1, v2, v3, v4, v5, v6, v7, v8, v9 + logical :: l + l = .false. + a(:, :, :) = 0 + !$omp parallel do collapse (3) lastprivate (i, j, k, m) reduction (.or.:l) + do i = v1, v2, v7 + do j = v3, v4, v8 + do k = v5, v6, v9 + l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + l = l.or.k.lt.13.or.k.gt.18 + if (.not.l) a(i, j, k) = a(i, j, k) + 1 + m = i * 100 + j * 10 + k + end do + end do + end do + if (i.ne.7.or.j.ne.5.or.k.ne.19) call abort + if (m.ne.(600+40+18)) call abort + do i = 1, 7 + do j = -3, 5 + do k = 12, 19 + if (i.eq.1.or.i.eq.7.or.j.eq.-3.or.j.eq.5.or.k.eq.12.or.k.eq.19) then + if (a(i, j, k).ne.0) print *, i, j, k + else + if (a(i, j, k).ne.1) print *, 'kk', i, j, k, a(i, j, k) + end if + end do + end do + end do + end subroutine test3 + + subroutine test4 + integer :: i, j, k, a(1:7, -3:5, 12:19), m + logical :: l + l = .false. 
+ a(:, :, :) = 0 + !$omp parallel do collapse (3) lastprivate (i, j, k, m) reduction (.or.:l) & + !$omp& schedule (dynamic, 5) + do i = 2, 6 + do j = -2, 4 + do k = 13, 18 + l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + l = l.or.k.lt.13.or.k.gt.18 + if (.not.l) a(i, j, k) = a(i, j, k) + 1 + m = i * 100 + j * 10 + k + end do + end do + end do + if (i.ne.7.or.j.ne.5.or.k.ne.19) call abort + if (m.ne.(600+40+18)) call abort + do i = 1, 7 + do j = -3, 5 + do k = 12, 19 + if (i.eq.1.or.i.eq.7.or.j.eq.-3.or.j.eq.5.or.k.eq.12.or.k.eq.19) then + if (a(i, j, k).ne.0) print *, i, j, k + else + if (a(i, j, k).ne.1) print *, 'kk', i, j, k, a(i, j, k) + end if + end do + end do + end do + end subroutine test4 + + subroutine test5(v1, v2, v3, v4, v5, v6) + integer :: i, j, k, a(1:7, -3:5, 12:19), m + integer :: v1, v2, v3, v4, v5, v6 + logical :: l + l = .false. + a(:, :, :) = 0 + !$omp parallel do collapse (3) lastprivate (i, j, k, m) reduction (.or.:l) & + !$omp & schedule (guided) + do i = v1, v2 + do j = v3, v4 + do k = v5, v6 + l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + l = l.or.k.lt.13.or.k.gt.18 + if (.not.l) a(i, j, k) = a(i, j, k) + 1 + m = i * 100 + j * 10 + k + end do + end do + end do + if (i.ne.7.or.j.ne.5.or.k.ne.19) call abort + if (m.ne.(600+40+18)) call abort + do i = 1, 7 + do j = -3, 5 + do k = 12, 19 + if (i.eq.1.or.i.eq.7.or.j.eq.-3.or.j.eq.5.or.k.eq.12.or.k.eq.19) then + if (a(i, j, k).ne.0) print *, i, j, k + else + if (a(i, j, k).ne.1) print *, 'kk', i, j, k, a(i, j, k) + end if + end do + end do + end do + end subroutine test5 + + subroutine test6(v1, v2, v3, v4, v5, v6, v7, v8, v9) + integer :: i, j, k, a(1:7, -3:5, 12:19), m + integer :: v1, v2, v3, v4, v5, v6, v7, v8, v9 + logical :: l + l = .false. 
+ a(:, :, :) = 0 + !$omp parallel do collapse (3) lastprivate (i, j, k, m) reduction (.or.:l) & + !$omp & schedule (dynamic) + do i = v1, v2, v7 + do j = v3, v4, v8 + do k = v5, v6, v9 + l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + l = l.or.k.lt.13.or.k.gt.18 + if (.not.l) a(i, j, k) = a(i, j, k) + 1 + m = i * 100 + j * 10 + k + end do + end do + end do + if (i.ne.7.or.j.ne.5.or.k.ne.19) call abort + if (m.ne.(600+40+18)) call abort + do i = 1, 7 + do j = -3, 5 + do k = 12, 19 + if (i.eq.1.or.i.eq.7.or.j.eq.-3.or.j.eq.5.or.k.eq.12.or.k.eq.19) then + if (a(i, j, k).ne.0) print *, i, j, k + else + if (a(i, j, k).ne.1) print *, 'kk', i, j, k, a(i, j, k) + end if + end do + end do + end do + end subroutine test6 + +end program collapse3 diff --git a/libgomp/testsuite/libgomp.fortran/collapse4.f90 b/libgomp/testsuite/libgomp.fortran/collapse4.f90 new file mode 100644 index 00000000000..f19b0f6c695 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/collapse4.f90 @@ -0,0 +1,12 @@ +! { dg-do run } + + integer :: i, j, k + !$omp parallel do lastprivate (i, j, k) collapse (3) + do i = 0, 17 + do j = 0, 6 + do k = 0, 5 + end do + end do + end do + if (i .ne. 18 .or. j .ne. 7 .or. k .ne. 6) call abort +end diff --git a/libgomp/testsuite/libgomp.fortran/lastprivate1.f90 b/libgomp/testsuite/libgomp.fortran/lastprivate1.f90 new file mode 100644 index 00000000000..91bb96ca75a --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/lastprivate1.f90 @@ -0,0 +1,126 @@ +program lastprivate + integer :: i + common /c/ i + !$omp parallel num_threads (4) + call test1 + !$omp end parallel + if (i .ne. 21) call abort + !$omp parallel num_threads (4) + call test2 + !$omp end parallel + if (i .ne. 64) call abort + !$omp parallel num_threads (4) + call test3 + !$omp end parallel + if (i .ne. 
14) call abort + call test4 + call test5 + call test6 + call test7 + call test8 + call test9 + call test10 + call test11 + call test12 +contains + subroutine test1 + integer :: i + common /c/ i + !$omp do lastprivate (i) + do i = 1, 20 + end do + end subroutine test1 + subroutine test2 + integer :: i + common /c/ i + !$omp do lastprivate (i) + do i = 7, 61, 3 + end do + end subroutine test2 + function ret3 () + integer :: ret3 + ret3 = 3 + end function ret3 + subroutine test3 + integer :: i + common /c/ i + !$omp do lastprivate (i) + do i = -10, 11, ret3 () + end do + end subroutine test3 + subroutine test4 + integer :: j + !$omp parallel do lastprivate (j) num_threads (4) default (none) + do j = 1, 20 + end do + if (j .ne. 21) call abort + end subroutine test4 + subroutine test5 + integer :: j + !$omp parallel do lastprivate (j) num_threads (4) default (none) + do j = 7, 61, 3 + end do + if (j .ne. 64) call abort + end subroutine test5 + subroutine test6 + integer :: j + !$omp parallel do lastprivate (j) num_threads (4) default (none) + do j = -10, 11, ret3 () + end do + if (j .ne. 14) call abort + end subroutine test6 + subroutine test7 + integer :: i + common /c/ i + !$omp parallel do lastprivate (i) num_threads (4) default (none) + do i = 1, 20 + end do + if (i .ne. 21) call abort + end subroutine test7 + subroutine test8 + integer :: i + common /c/ i + !$omp parallel do lastprivate (i) num_threads (4) default (none) + do i = 7, 61, 3 + end do + if (i .ne. 64) call abort + end subroutine test8 + subroutine test9 + integer :: i + common /c/ i + !$omp parallel do lastprivate (i) num_threads (4) default (none) + do i = -10, 11, ret3 () + end do + if (i .ne. 14) call abort + end subroutine test9 + subroutine test10 + integer :: i + common /c/ i + !$omp parallel num_threads (4) default (none) shared (i) + !$omp do lastprivate (i) + do i = 1, 20 + end do + !$omp end parallel + if (i .ne. 
21) call abort + end subroutine test10 + subroutine test11 + integer :: i + common /c/ i + !$omp parallel num_threads (4) default (none) shared (i) + !$omp do lastprivate (i) + do i = 7, 61, 3 + end do + !$omp end parallel + if (i .ne. 64) call abort + end subroutine test11 + subroutine test12 + integer :: i + common /c/ i + !$omp parallel num_threads (4) default (none) shared (i) + !$omp do lastprivate (i) + do i = -10, 11, ret3 () + end do + !$omp end parallel + if (i .ne. 14) call abort + end subroutine test12 +end program lastprivate diff --git a/libgomp/testsuite/libgomp.fortran/lastprivate2.f90 b/libgomp/testsuite/libgomp.fortran/lastprivate2.f90 new file mode 100644 index 00000000000..6d7e11eab00 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/lastprivate2.f90 @@ -0,0 +1,141 @@ +program lastprivate + integer :: i, k + common /c/ i, k + !$omp parallel num_threads (4) + call test1 + !$omp end parallel + if (i .ne. 21 .or. k .ne. 20) call abort + !$omp parallel num_threads (4) + call test2 + !$omp end parallel + if (i .ne. 64 .or. k .ne. 61) call abort + !$omp parallel num_threads (4) + call test3 + !$omp end parallel + if (i .ne. 14 .or. k .ne. 11) call abort + call test4 + call test5 + call test6 + call test7 + call test8 + call test9 + call test10 + call test11 + call test12 +contains + subroutine test1 + integer :: i, k + common /c/ i, k + !$omp do lastprivate (i, k) + do i = 1, 20 + k = i + end do + end subroutine test1 + subroutine test2 + integer :: i, k + common /c/ i, k + !$omp do lastprivate (i, k) + do i = 7, 61, 3 + k = i + end do + end subroutine test2 + function ret3 () + integer :: ret3 + ret3 = 3 + end function ret3 + subroutine test3 + integer :: i, k + common /c/ i, k + !$omp do lastprivate (i, k) + do i = -10, 11, ret3 () + k = i + end do + end subroutine test3 + subroutine test4 + integer :: j, l + !$omp parallel do lastprivate (j, l) num_threads (4) + do j = 1, 20 + l = j + end do + if (j .ne. 21 .or. l .ne. 
20) call abort + end subroutine test4 + subroutine test5 + integer :: j, l + l = 77 + !$omp parallel do lastprivate (j, l) num_threads (4) firstprivate (l) + do j = 7, 61, 3 + l = j + end do + if (j .ne. 64 .or. l .ne. 61) call abort + end subroutine test5 + subroutine test6 + integer :: j, l + !$omp parallel do lastprivate (j, l) num_threads (4) + do j = -10, 11, ret3 () + l = j + end do + if (j .ne. 14 .or. l .ne. 11) call abort + end subroutine test6 + subroutine test7 + integer :: i, k + common /c/ i, k + !$omp parallel do lastprivate (i, k) num_threads (4) + do i = 1, 20 + k = i + end do + if (i .ne. 21 .or. k .ne. 20) call abort + end subroutine test7 + subroutine test8 + integer :: i, k + common /c/ i, k + !$omp parallel do lastprivate (i, k) num_threads (4) + do i = 7, 61, 3 + k = i + end do + if (i .ne. 64 .or. k .ne. 61) call abort + end subroutine test8 + subroutine test9 + integer :: i, k + common /c/ i, k + k = 77 + !$omp parallel do lastprivate (i, k) num_threads (4) firstprivate (k) + do i = -10, 11, ret3 () + k = i + end do + if (i .ne. 14 .or. k .ne. 11) call abort + end subroutine test9 + subroutine test10 + integer :: i, k + common /c/ i, k + !$omp parallel num_threads (4) + !$omp do lastprivate (i, k) + do i = 1, 20 + k = i + end do + !$omp end parallel + if (i .ne. 21 .or. k .ne. 20) call abort + end subroutine test10 + subroutine test11 + integer :: i, k + common /c/ i, k + !$omp parallel num_threads (4) + !$omp do lastprivate (i, k) + do i = 7, 61, 3 + k = i + end do + !$omp end parallel + if (i .ne. 64 .or. k .ne. 61) call abort + end subroutine test11 + subroutine test12 + integer :: i, k + common /c/ i, k + k = 77 + !$omp parallel num_threads (4) + !$omp do lastprivate (i, k) firstprivate (k) + do i = -10, 11, ret3 () + k = i + end do + !$omp end parallel + if (i .ne. 14 .or. k .ne. 
11) call abort + end subroutine test12 +end program lastprivate diff --git a/libgomp/testsuite/libgomp.fortran/lib4.f90 b/libgomp/testsuite/libgomp.fortran/lib4.f90 new file mode 100644 index 00000000000..cbb984574ff --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/lib4.f90 @@ -0,0 +1,16 @@ +! { dg-do run } + +program lib4 + use omp_lib + integer (omp_sched_kind) :: kind + integer :: modifier + call omp_set_schedule (omp_sched_static, 32) + call omp_get_schedule (kind, modifier) + if (kind.ne.omp_sched_static.or.modifier.ne.32) call abort + call omp_set_schedule (omp_sched_dynamic, 4) + call omp_get_schedule (kind, modifier) + if (kind.ne.omp_sched_dynamic.or.modifier.ne.4) call abort + if (omp_get_thread_limit ().lt.0) call abort + call omp_set_max_active_levels (6) + if (omp_get_max_active_levels ().ne.6) call abort +end program lib4 diff --git a/libgomp/testsuite/libgomp.fortran/lock-1.f90 b/libgomp/testsuite/libgomp.fortran/lock-1.f90 new file mode 100644 index 00000000000..d7d3e3fd6cc --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/lock-1.f90 @@ -0,0 +1,24 @@ +! { dg-do run } + + use omp_lib + + integer (kind = omp_nest_lock_kind) :: lock + logical :: l + + l = .false. + call omp_init_nest_lock (lock) + if (omp_test_nest_lock (lock) .ne. 1) call abort + if (omp_test_nest_lock (lock) .ne. 2) call abort +!$omp parallel if (.false.) reduction (.or.:l) + ! In OpenMP 2.5 this was supposed to return 3, + ! but in OpenMP 3.0 the parallel region has a different + ! task and omp_*_lock_t are owned by tasks, not by threads. + if (omp_test_nest_lock (lock) .ne. 0) l = .true. +!$omp end parallel + if (l) call abort + if (omp_test_nest_lock (lock) .ne. 
3) call abort
+ call omp_unset_nest_lock (lock)
+ call omp_unset_nest_lock (lock)
+ call omp_unset_nest_lock (lock)
+ call omp_destroy_nest_lock (lock)
+end
diff --git a/libgomp/testsuite/libgomp.fortran/lock-2.f90 b/libgomp/testsuite/libgomp.fortran/lock-2.f90
new file mode 100644
index 00000000000..9965139b9ba
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/lock-2.f90
@@ -0,0 +1,24 @@
+! { dg-do run }
+
+ use omp_lib
+
+ integer (kind = omp_nest_lock_kind) :: lock
+ logical :: l
+
+ l = .false.
+ call omp_init_nest_lock (lock)
+!$omp parallel num_threads (1) reduction (.or.:l)
+ if (omp_test_nest_lock (lock) .ne. 1) call abort
+ if (omp_test_nest_lock (lock) .ne. 2) call abort
+!$omp task if (.false.) shared (lock, l)
+ if (omp_test_nest_lock (lock) .ne. 0) l = .true.
+!$omp end task
+!$omp taskwait
+ if (omp_test_nest_lock (lock) .ne. 3) l = .true.
+ call omp_unset_nest_lock (lock)
+ call omp_unset_nest_lock (lock)
+ call omp_unset_nest_lock (lock)
+!$omp end parallel
+ if (l) call abort
+ call omp_destroy_nest_lock (lock)
+end
diff --git a/libgomp/testsuite/libgomp.fortran/nested1.f90 b/libgomp/testsuite/libgomp.fortran/nested1.f90
new file mode 100644
index 00000000000..98c4322d0bf
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/nested1.f90
@@ -0,0 +1,87 @@
+! { dg-do run }
+program nested1
+ use omp_lib
+ integer :: e1, e2, e3, e
+ integer :: tn1, tn2, tn3
+ e1 = 0
+ e2 = 0
+ e3 = 0
+ call omp_set_nested (.true.)
+ call omp_set_dynamic (.false.)
+ if (omp_in_parallel ()) call abort + if (omp_get_num_threads ().ne.1) call abort + if (omp_get_level ().ne.0) call abort + if (omp_get_ancestor_thread_num (0).ne.0) call abort + if (omp_get_ancestor_thread_num (-1).ne.-1) call abort + if (omp_get_ancestor_thread_num (1).ne.-1) call abort + if (omp_get_team_size (0).ne.1) call abort + if (omp_get_team_size (-1).ne.-1) call abort + if (omp_get_team_size (1).ne.-1) call abort + if (omp_get_active_level ().ne.0) call abort +!$omp parallel num_threads (4) private (e, tn1) + e = 0 + tn1 = omp_get_thread_num () + if (.not.omp_in_parallel ()) e = e + 1 + if (omp_get_num_threads ().ne.4) e = e + 1 + if (tn1.lt.0.or.tn1.ge.4) e = e + 1 + if (omp_get_level ().ne.1) e = e + 1 + if (omp_get_ancestor_thread_num (0).ne.0) e = e + 1 + if (omp_get_ancestor_thread_num (1).ne.tn1) e = e + 1 + if (omp_get_ancestor_thread_num (-1).ne.-1) e = e + 1 + if (omp_get_ancestor_thread_num (2).ne.-1) e = e + 1 + if (omp_get_team_size (0).ne.1) e = e + 1 + if (omp_get_team_size (1).ne.4) e = e + 1 + if (omp_get_team_size (-1).ne.-1) e = e + 1 + if (omp_get_team_size (2).ne.-1) e = e + 1 + if (omp_get_active_level ().ne.1) e = e + 1 + !$omp atomic + e1 = e1 + e +!$omp parallel num_threads (5) if (.false.) 
firstprivate (tn1) & +!$omp& private (e, tn2) + e = 0 + tn2 = omp_get_thread_num () + if (.not.omp_in_parallel ()) e = e + 1 + if (omp_get_num_threads ().ne.1) e = e + 1 + if (tn2.ne.0) e = e + 1 + if (omp_get_level ().ne.2) e = e + 1 + if (omp_get_ancestor_thread_num (0).ne.0) e = e + 1 + if (omp_get_ancestor_thread_num (1).ne.tn1) e = e + 1 + if (omp_get_ancestor_thread_num (2).ne.tn2) e = e + 1 + if (omp_get_ancestor_thread_num (-1).ne.-1) e = e + 1 + if (omp_get_ancestor_thread_num (3).ne.-1) e = e + 1 + if (omp_get_team_size (0).ne.1) e = e + 1 + if (omp_get_team_size (1).ne.4) e = e + 1 + if (omp_get_team_size (2).ne.1) e = e + 1 + if (omp_get_team_size (-1).ne.-1) e = e + 1 + if (omp_get_team_size (3).ne.-1) e = e + 1 + if (omp_get_active_level ().ne.1) e = e + 1 + !$omp atomic + e2 = e2 + e +!$omp parallel num_threads (2) firstprivate (tn1, tn2) & +!$omp& private (e, tn3) + e = 0 + tn3 = omp_get_thread_num () + if (.not.omp_in_parallel ()) e = e + 1 + if (omp_get_num_threads ().ne.2) e = e + 1 + if (tn3.lt.0.or.tn3.ge.2) e = e + 1 + if (omp_get_level ().ne.3) e = e + 1 + if (omp_get_ancestor_thread_num (0).ne.0) e = e + 1 + if (omp_get_ancestor_thread_num (1).ne.tn1) e = e + 1 + if (omp_get_ancestor_thread_num (2).ne.tn2) e = e + 1 + if (omp_get_ancestor_thread_num (3).ne.tn3) e = e + 1 + if (omp_get_ancestor_thread_num (-1).ne.-1) e = e + 1 + if (omp_get_ancestor_thread_num (4).ne.-1) e = e + 1 + if (omp_get_team_size (0).ne.1) e = e + 1 + if (omp_get_team_size (1).ne.4) e = e + 1 + if (omp_get_team_size (2).ne.1) e = e + 1 + if (omp_get_team_size (3).ne.2) e = e + 1 + if (omp_get_team_size (-1).ne.-1) e = e + 1 + if (omp_get_team_size (4).ne.-1) e = e + 1 + if (omp_get_active_level ().ne.2) e = e + 1 + !$omp atomic + e3 = e3 + e +!$omp end parallel +!$omp end parallel +!$omp end parallel + if (e1.ne.0.or.e2.ne.0.or.e3.ne.0) call abort +end program nested1 diff --git a/libgomp/testsuite/libgomp.fortran/nestedfn4.f90 
b/libgomp/testsuite/libgomp.fortran/nestedfn4.f90 new file mode 100644 index 00000000000..c987bf440b0 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/nestedfn4.f90 @@ -0,0 +1,41 @@ +program foo + integer :: i, j, k + integer :: a(10), c(10) + k = 2 + a(:) = 0 + call test1 + call test2 + do i = 1, 10 + if (a(i) .ne. 10 * i) call abort + end do + !$omp parallel do reduction (+:c) + do i = 1, 10 + c = c + a + end do + do i = 1, 10 + if (c(i) .ne. 10 * a(i)) call abort + end do + !$omp parallel do lastprivate (j) + do j = 1, 10, k + end do + if (j .ne. 11) call abort +contains + subroutine test1 + integer :: i + integer :: b(10) + do i = 1, 10 + b(i) = i + end do + c(:) = 0 + !$omp parallel do reduction (+:a) + do i = 1, 10 + a = a + b + end do + end subroutine test1 + subroutine test2 + !$omp parallel do lastprivate (j) + do j = 1, 10, k + end do + if (j .ne. 11) call abort + end subroutine test2 +end program foo diff --git a/libgomp/testsuite/libgomp.fortran/strassen.f90 b/libgomp/testsuite/libgomp.fortran/strassen.f90 new file mode 100644 index 00000000000..b44982665a6 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/strassen.f90 @@ -0,0 +1,75 @@ +! { dg-options "-O2" } + +program strassen_matmul + use omp_lib + integer, parameter :: N = 1024 + double precision, save :: A(N,N), B(N,N), C(N,N), D(N,N) + double precision :: start, end + + call random_seed + call random_number (A) + call random_number (B) + start = omp_get_wtime () + C = matmul (A, B) + end = omp_get_wtime () + write(*,'(a, f10.6)') ' Time for matmul = ', end - start + D = 0 + start = omp_get_wtime () + call strassen (A, B, D, N) + end = omp_get_wtime () + write(*,'(a, f10.6)') ' Time for Strassen = ', end - start + if (sqrt (sum ((C - D) ** 2)) / N .gt. 
0.1) call abort + D = 0 + start = omp_get_wtime () +!$omp parallel +!$omp single + call strassen (A, B, D, N) +!$omp end single nowait +!$omp end parallel + end = omp_get_wtime () + write(*,'(a, f10.6)') ' Time for Strassen MP = ', end - start + if (sqrt (sum ((C - D) ** 2)) / N .gt. 0.1) call abort + +contains + + recursive subroutine strassen (A, B, C, N) + integer, intent(in) :: N + double precision, intent(in) :: A(N,N), B(N,N) + double precision, intent(out) :: C(N,N) + double precision :: T(N/2,N/2,7) + integer :: K, L + + if (iand (N,1) .ne. 0 .or. N < 64) then + C = matmul (A, B) + return + end if + K = N / 2 + L = N / 2 + 1 +!$omp task shared (A, B, T) + call strassen (A(:K,:K) + A(L:,L:), B(:K,:K) + B(L:,L:), T(:,:,1), K) +!$omp end task +!$omp task shared (A, B, T) + call strassen (A(L:,:K) + A(L:,L:), B(:K,:K), T(:,:,2), K) +!$omp end task +!$omp task shared (A, B, T) + call strassen (A(:K,:K), B(:K,L:) - B(L:,L:), T(:,:,3), K) +!$omp end task +!$omp task shared (A, B, T) + call strassen (A(L:,L:), B(L:,:K) - B(:K,:K), T(:,:,4), K) +!$omp end task +!$omp task shared (A, B, T) + call strassen (A(:K,:K) + A(:K,L:), B(L:,L:), T(:,:,5), K) +!$omp end task +!$omp task shared (A, B, T) + call strassen (A(L:,:K) - A(:K,:K), B(:K,:K) + B(:K,L:), T(:,:,6), K) +!$omp end task +!$omp task shared (A, B, T) + call strassen (A(:K,L:) - A(L:,L:), B(L:,:K) + B(L:,L:), T(:,:,7), K) +!$omp end task +!$omp taskwait + C(:K,:K) = T(:,:,1) + T(:,:,4) - T(:,:,5) + T(:,:,7) + C(L:,:K) = T(:,:,2) + T(:,:,4) + C(:K,L:) = T(:,:,3) + T(:,:,5) + C(L:,L:) = T(:,:,1) - T(:,:,2) + T(:,:,3) + T(:,:,6) + end subroutine strassen +end diff --git a/libgomp/testsuite/libgomp.fortran/tabs1.f90 b/libgomp/testsuite/libgomp.fortran/tabs1.f90 new file mode 100644 index 00000000000..4f3d4f5b435 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/tabs1.f90 @@ -0,0 +1,12 @@ + if (b().ne.2) call abort +contains +subroutine a +!$omp parallel + !$omp end parallel + end subroutine a +function b() + 
integer :: b
+ b = 1
+ !$ b = 2
+end function b
+ end
diff --git a/libgomp/testsuite/libgomp.fortran/tabs2.f b/libgomp/testsuite/libgomp.fortran/tabs2.f
new file mode 100644
index 00000000000..7aed5498d34
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/tabs2.f
@@ -0,0 +1,13 @@
+! { dg-options "-ffixed-form" }
+ if (b().ne.2) call abort
+ contains
+ subroutine a
+!$omp parallel
+!$omp end parallel
+ end subroutine a
+ function b()
+ integer :: b
+ b = 1
+!$ b = 2
+ end function b
+ end
diff --git a/libgomp/testsuite/libgomp.fortran/task1.f90 b/libgomp/testsuite/libgomp.fortran/task1.f90
new file mode 100644
index 00000000000..df57cb83168
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/task1.f90
@@ -0,0 +1,27 @@
+! { dg-do run }
+
+program tasktest
+ use omp_lib
+ integer :: i, j
+ common /tasktest_j/ j
+ j = 0
+ !$omp parallel private (i)
+ i = omp_get_thread_num ()
+ if (i.lt.2) then
+ !$omp task if (.false.) default(firstprivate)
+ call subr (i + 1)
+ !$omp end task
+ end if
+ !$omp end parallel
+ if (j.gt.0) call abort
+contains
+ subroutine subr (i)
+ use omp_lib
+ integer :: i, j
+ common /tasktest_j/ j
+ if (omp_get_thread_num ().ne.(i - 1)) then
+ !$omp atomic
+ j = j + 1
+ end if
+ end subroutine subr
+end program tasktest
diff --git a/libgomp/testsuite/libgomp.fortran/task2.f90 b/libgomp/testsuite/libgomp.fortran/task2.f90
new file mode 100644
index 00000000000..24ffee53ac8
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/task2.f90
@@ -0,0 +1,142 @@
+ integer :: err
+ err = 0
+!$omp parallel num_threads (4) default (none) shared (err)
+!$omp single
+ call test
+!$omp end single
+!$omp end parallel
+ if (err.ne.0) call abort
+contains
+ subroutine check (x, y, l)
+ integer :: x, y
+ logical :: l
+ l = l .or. x .ne. 
y + end subroutine check + + subroutine foo (c, d, e, f, g, h, i, j, k, n) + use omp_lib + integer :: n + character (len = *) :: c + character (len = n) :: d + integer, dimension (2, 3:5, n) :: e + integer, dimension (2, 3:n, n) :: f + character (len = *), dimension (5, 3:n) :: g + character (len = n), dimension (5, 3:n) :: h + real, dimension (:, :, :) :: i + double precision, dimension (3:, 5:, 7:) :: j + integer, dimension (:, :, :) :: k + logical :: l + integer :: p, q, r + character (len = n) :: s + integer, dimension (2, 3:5, n) :: t + integer, dimension (2, 3:n, n) :: u + character (len = n), dimension (5, 3:n) :: v + character (len = 2 * n + 24) :: w + integer :: x, z + character (len = 1) :: y + s = 'PQRSTUV' + forall (p = 1:2, q = 3:5, r = 1:7) t(p, q, r) = -10 + p - q + 2 * r + forall (p = 1:2, q = 3:7, r = 1:7) u(p, q, r) = 30 - p + q - 2 * r + forall (p = 1:5, q = 3:7, p + q .le. 8) v(p, q) = '_+|/Oo_' + forall (p = 1:5, q = 3:7, p + q .gt. 8) v(p, q) = '///|||!' +!$omp task default (none) firstprivate (c, d, e, f, g, h, i, j, k) & +!$omp & firstprivate (s, t, u, v) private (l, p, q, r, w, x, y) shared (err) + l = .false. + l = l .or. c .ne. 'abcdefghijkl' + l = l .or. d .ne. 'ABCDEFG' + l = l .or. s .ne. 'PQRSTUV' + do 100, p = 1, 2 + do 100, q = 3, 7 + do 100, r = 1, 7 + if (q .lt. 6) l = l .or. e(p, q, r) .ne. 5 + p + q + 2 * r + l = l .or. f(p, q, r) .ne. 25 + p + q + 2 * r + if (r .lt. 6 .and. q + r .le. 8) l = l .or. g(r, q) .ne. '0123456789AB' + if (r .lt. 6 .and. q + r .gt. 8) l = l .or. g(r, q) .ne. '9876543210ZY' + if (r .lt. 6 .and. q + r .le. 8) l = l .or. h(r, q) .ne. '0123456' + if (r .lt. 6 .and. q + r .gt. 8) l = l .or. h(r, q) .ne. '9876543' + if (q .lt. 6) l = l .or. t(p, q, r) .ne. -10 + p - q + 2 * r + l = l .or. u(p, q, r) .ne. 30 - p + q - 2 * r + if (r .lt. 6 .and. q + r .le. 8) l = l .or. v(r, q) .ne. '_+|/Oo_' + if (r .lt. 6 .and. q + r .gt. 8) l = l .or. v(r, q) .ne. '///|||!' 
+100 continue + do 101, p = 3, 5 + do 101, q = 2, 6 + do 101, r = 1, 7 + l = l .or. i(p - 2, q - 1, r) .ne. 7.5 * p * q * r + l = l .or. j(p, q + 3, r + 6) .ne. 9.5 * p * q * r +101 continue + do 102, p = 1, 5 + do 102, q = 4, 6 + l = l .or. k(p, 1, q - 3) .ne. 19 + p + 7 + 3 * q +102 continue + call check (size (e, 1), 2, l) + call check (size (e, 2), 3, l) + call check (size (e, 3), 7, l) + call check (size (e), 42, l) + call check (size (f, 1), 2, l) + call check (size (f, 2), 5, l) + call check (size (f, 3), 7, l) + call check (size (f), 70, l) + call check (size (g, 1), 5, l) + call check (size (g, 2), 5, l) + call check (size (g), 25, l) + call check (size (h, 1), 5, l) + call check (size (h, 2), 5, l) + call check (size (h), 25, l) + call check (size (i, 1), 3, l) + call check (size (i, 2), 5, l) + call check (size (i, 3), 7, l) + call check (size (i), 105, l) + call check (size (j, 1), 4, l) + call check (size (j, 2), 5, l) + call check (size (j, 3), 7, l) + call check (size (j), 140, l) + call check (size (k, 1), 5, l) + call check (size (k, 2), 1, l) + call check (size (k, 3), 3, l) + call check (size (k), 15, l) + if (l) then +!$omp atomic + err = err + 1 + end if +!$omp end task + c = '' + d = '' + e(:, :, :) = 199 + f(:, :, :) = 198 + g(:, :) = '' + h(:, :) = '' + i(:, :, :) = 7.0 + j(:, :, :) = 8.0 + k(:, :, :) = 9 + s = '' + t(:, :, :) = 10 + u(:, :, :) = 11 + v(:, :) = '' + end subroutine foo + + subroutine test + character (len = 12) :: c + character (len = 7) :: d + integer, dimension (2, 3:5, 7) :: e + integer, dimension (2, 3:7, 7) :: f + character (len = 12), dimension (5, 3:7) :: g + character (len = 7), dimension (5, 3:7) :: h + real, dimension (3:5, 2:6, 1:7) :: i + double precision, dimension (3:6, 2:6, 1:7) :: j + integer, dimension (1:5, 7:7, 4:6) :: k + integer :: p, q, r + c = 'abcdefghijkl' + d = 'ABCDEFG' + forall (p = 1:2, q = 3:5, r = 1:7) e(p, q, r) = 5 + p + q + 2 * r + forall (p = 1:2, q = 3:7, r = 1:7) f(p, q, r) = 25 + p + q + 
2 * r + forall (p = 1:5, q = 3:7, p + q .le. 8) g(p, q) = '0123456789AB' + forall (p = 1:5, q = 3:7, p + q .gt. 8) g(p, q) = '9876543210ZY' + forall (p = 1:5, q = 3:7, p + q .le. 8) h(p, q) = '0123456' + forall (p = 1:5, q = 3:7, p + q .gt. 8) h(p, q) = '9876543' + forall (p = 3:5, q = 2:6, r = 1:7) i(p, q, r) = 7.5 * p * q * r + forall (p = 3:6, q = 2:6, r = 1:7) j(p, q, r) = 9.5 * p * q * r + forall (p = 1:5, q = 7:7, r = 4:6) k(p, q, r) = 19 + p + q + 3 * r + call foo (c, d, e, f, g, h, i, j, k, 7) + end subroutine test +end diff --git a/libgomp/testsuite/libgomp.fortran/vla4.f90 b/libgomp/testsuite/libgomp.fortran/vla4.f90 index 58caabc6248..cdd4849b6ad 100644 --- a/libgomp/testsuite/libgomp.fortran/vla4.f90 +++ b/libgomp/testsuite/libgomp.fortran/vla4.f90 @@ -94,7 +94,7 @@ contains forall (p = 1:2, q = 3:7, r = 1:7) u(p, q, r) = 30 - x - p + q - 2 * r forall (p = 1:5, q = 3:7, p + q .le. 8) v(p, q) = w(1:7) forall (p = 1:5, q = 3:7, p + q .gt. 8) v(p, q) = w(20:26) -!$omp barrier +!$omp barrier ! { dg-warning "may not be closely nested" } y = '' if (x .eq. 0) y = '0' if (x .eq. 1) y = '1' diff --git a/libgomp/testsuite/libgomp.fortran/vla5.f90 b/libgomp/testsuite/libgomp.fortran/vla5.f90 index 5c889f9923a..9b611505219 100644 --- a/libgomp/testsuite/libgomp.fortran/vla5.f90 +++ b/libgomp/testsuite/libgomp.fortran/vla5.f90 @@ -66,7 +66,7 @@ contains forall (p = 1:2, q = 3:7, r = 1:7) u(p, q, r) = 30 - x - p + q - 2 * r forall (p = 1:5, q = 3:7, p + q .le. 8) v(p, q) = w(1:7) forall (p = 1:5, q = 3:7, p + q .gt. 8) v(p, q) = w(20:26) -!$omp barrier +!$omp barrier ! { dg-warning "may not be closely nested" } y = '' if (x .eq. 0) y = '0' if (x .eq. 1) y = '1' diff --git a/libgomp/work.c b/libgomp/work.c index cd20c9dbe73..b48a5e3244b 100644 --- a/libgomp/work.c +++ b/libgomp/work.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2005 Free Software Foundation, Inc. +/* Copyright (C) 2005, 2008 Free Software Foundation, Inc. Contributed by Richard Henderson <rth@redhat.com>. 
 
    This file is part of the GNU OpenMP Library (libgomp).
@@ -29,39 +29,138 @@
    of threads.  */
 
 #include "libgomp.h"
+#include <stddef.h>
 #include <stdlib.h>
 #include <string.h>
 
 
-/* Create a new work share structure.  */
+/* Allocate a new work share structure, preferably from current team's
+   free gomp_work_share cache.  */
 
-struct gomp_work_share *
-gomp_new_work_share (bool ordered, unsigned nthreads)
+static struct gomp_work_share *
+alloc_work_share (struct gomp_team *team)
 {
   struct gomp_work_share *ws;
-  size_t size;
+  unsigned int i;
 
-  size = sizeof (*ws);
-  if (ordered)
-    size += nthreads * sizeof (ws->ordered_team_ids[0]);
+  /* This is called in a critical section.  */
+  if (team->work_share_list_alloc != NULL)
+    {
+      ws = team->work_share_list_alloc;
+      team->work_share_list_alloc = ws->next_free;
+      return ws;
+    }
 
-  ws = gomp_malloc_cleared (size);
-  gomp_mutex_init (&ws->lock);
-  ws->ordered_owner = -1;
+#ifdef HAVE_SYNC_BUILTINS
+  ws = team->work_share_list_free;
+  /* We need atomic read from work_share_list_free,
+     as free_work_share can be called concurrently.  */
+  __asm ("" : "+r" (ws));
+
+  if (ws && ws->next_free)
+    {
+      struct gomp_work_share *next = ws->next_free;
+      ws->next_free = NULL;
+      team->work_share_list_alloc = next->next_free;
+      return next;
+    }
+#else
+  gomp_mutex_lock (&team->work_share_list_free_lock);
+  ws = team->work_share_list_free;
+  if (ws)
+    {
+      team->work_share_list_alloc = ws->next_free;
+      team->work_share_list_free = NULL;
+      gomp_mutex_unlock (&team->work_share_list_free_lock);
+      return ws;
+    }
+  gomp_mutex_unlock (&team->work_share_list_free_lock);
+#endif
 
+  team->work_share_chunk *= 2;
+  ws = gomp_malloc (team->work_share_chunk * sizeof (struct gomp_work_share));
+  ws->next_alloc = team->work_shares[0].next_alloc;
+  team->work_shares[0].next_alloc = ws;
+  team->work_share_list_alloc = &ws[1];
+  for (i = 1; i < team->work_share_chunk - 1; i++)
+    ws[i].next_free = &ws[i + 1];
+  ws[i].next_free = NULL;
   return ws;
 }
 
+/* Initialize an already allocated struct gomp_work_share.
+   This shouldn't touch the next_alloc field.  */
+
+void
+gomp_init_work_share (struct gomp_work_share *ws, bool ordered,
+                      unsigned nthreads)
+{
+  gomp_mutex_init (&ws->lock);
+  if (__builtin_expect (ordered, 0))
+    {
+#define INLINE_ORDERED_TEAM_IDS_CNT \
+  ((sizeof (struct gomp_work_share) \
+    - offsetof (struct gomp_work_share, inline_ordered_team_ids)) \
+   / sizeof (((struct gomp_work_share *) 0)->inline_ordered_team_ids[0]))
+
+      if (nthreads > INLINE_ORDERED_TEAM_IDS_CNT)
+        ws->ordered_team_ids
+          = gomp_malloc (nthreads * sizeof (*ws->ordered_team_ids));
+      else
+        ws->ordered_team_ids = ws->inline_ordered_team_ids;
+      memset (ws->ordered_team_ids, '\0',
+              nthreads * sizeof (*ws->ordered_team_ids));
+      ws->ordered_num_used = 0;
+      ws->ordered_owner = -1;
+      ws->ordered_cur = 0;
+    }
+  else
+    ws->ordered_team_ids = NULL;
+  gomp_ptrlock_init (&ws->next_ws, NULL);
+  ws->threads_completed = 0;
+}
 
-/* Free a work share structure.  */
+/* Do any needed destruction of gomp_work_share fields before it
+   is put back into free gomp_work_share cache or freed.  */
 
-static void
-free_work_share (struct gomp_work_share *ws)
+void
+gomp_fini_work_share (struct gomp_work_share *ws)
 {
   gomp_mutex_destroy (&ws->lock);
-  free (ws);
+  if (ws->ordered_team_ids != ws->inline_ordered_team_ids)
+    free (ws->ordered_team_ids);
+  gomp_ptrlock_destroy (&ws->next_ws);
 }
 
+/* Free a work share struct, if not orphaned, put it into current
+   team's free gomp_work_share cache.  */
+
+static inline void
+free_work_share (struct gomp_team *team, struct gomp_work_share *ws)
+{
+  gomp_fini_work_share (ws);
+  if (__builtin_expect (team == NULL, 0))
+    free (ws);
+  else
+    {
+      struct gomp_work_share *next_ws;
+#ifdef HAVE_SYNC_BUILTINS
+      do
+        {
+          next_ws = team->work_share_list_free;
+          ws->next_free = next_ws;
+        }
+      while (!__sync_bool_compare_and_swap (&team->work_share_list_free,
+                                            next_ws, ws));
+#else
+      gomp_mutex_lock (&team->work_share_list_free_lock);
+      next_ws = team->work_share_list_free;
+      ws->next_free = next_ws;
+      team->work_share_list_free = ws;
+      gomp_mutex_unlock (&team->work_share_list_free_lock);
+#endif
+    }
+}
 
 /* The current thread is ready to begin the next work sharing construct.
    In all cases, thr->ts.work_share is updated to point to the new
@@ -74,71 +173,34 @@ gomp_work_share_start (bool ordered)
   struct gomp_thread *thr = gomp_thread ();
   struct gomp_team *team = thr->ts.team;
   struct gomp_work_share *ws;
-  unsigned ws_index, ws_gen;
 
   /* Work sharing constructs can be orphaned.  */
   if (team == NULL)
     {
-      ws = gomp_new_work_share (ordered, 1);
+      ws = gomp_malloc (sizeof (*ws));
+      gomp_init_work_share (ws, ordered, 1);
       thr->ts.work_share = ws;
-      thr->ts.static_trip = 0;
-      gomp_mutex_lock (&ws->lock);
-      return true;
+      return ws;
     }
 
-  gomp_mutex_lock (&team->work_share_lock);
-
-  /* This thread is beginning its next generation.  */
-  ws_gen = ++thr->ts.work_share_generation;
-
-  /* If this next generation is not newer than any other generation in
-     the team, then simply reference the existing construct.  */
-  if (ws_gen - team->oldest_live_gen < team->num_live_gen)
+  ws = thr->ts.work_share;
+  thr->ts.last_work_share = ws;
+  ws = gomp_ptrlock_get (&ws->next_ws);
+  if (ws == NULL)
     {
-      ws_index = ws_gen & team->generation_mask;
-      ws = team->work_shares[ws_index];
+      /* This thread encountered a new ws first.  */
+      struct gomp_work_share *ws = alloc_work_share (team);
+      gomp_init_work_share (ws, ordered, team->nthreads);
       thr->ts.work_share = ws;
-      thr->ts.static_trip = 0;
-
-      gomp_mutex_lock (&ws->lock);
-      gomp_mutex_unlock (&team->work_share_lock);
-
-      return false;
+      return true;
     }
-
-  /* Resize the work shares queue if we've run out of space.  */
-  if (team->num_live_gen++ == team->generation_mask)
+  else
     {
-      team->work_shares = gomp_realloc (team->work_shares,
-                                        2 * team->num_live_gen
-                                        * sizeof (*team->work_shares));
-
-      /* Unless oldest_live_gen is zero, the sequence of live elements
-         wraps around the end of the array.  If we do nothing, we break
-         lookup of the existing elements.  Fix that by unwrapping the
-         data from the front to the end.  */
-      if (team->oldest_live_gen > 0)
-        memcpy (team->work_shares + team->num_live_gen,
-                team->work_shares,
-                (team->oldest_live_gen & team->generation_mask)
-                * sizeof (*team->work_shares));
-
-      team->generation_mask = team->generation_mask * 2 + 1;
+      thr->ts.work_share = ws;
+      return false;
     }
-
-  ws_index = ws_gen & team->generation_mask;
-  ws = gomp_new_work_share (ordered, team->nthreads);
-  thr->ts.work_share = ws;
-  thr->ts.static_trip = 0;
-  team->work_shares[ws_index] = ws;
-
-  gomp_mutex_lock (&ws->lock);
-  gomp_mutex_unlock (&team->work_share_lock);
-
-  return true;
 }
 
-
 /* The current thread is done with its current work sharing construct.
    This version does imply a barrier at the end of the work-share.  */
 
@@ -147,36 +209,28 @@ gomp_work_share_end (void)
 {
   struct gomp_thread *thr = gomp_thread ();
   struct gomp_team *team = thr->ts.team;
-  struct gomp_work_share *ws = thr->ts.work_share;
-  bool last;
-
-  thr->ts.work_share = NULL;
+  gomp_barrier_state_t bstate;
 
   /* Work sharing constructs can be orphaned.  */
   if (team == NULL)
     {
-      free_work_share (ws);
+      free_work_share (NULL, thr->ts.work_share);
+      thr->ts.work_share = NULL;
       return;
     }
 
-  last = gomp_barrier_wait_start (&team->barrier);
+  bstate = gomp_barrier_wait_start (&team->barrier);
 
-  if (last)
+  if (gomp_barrier_last_thread (bstate))
     {
-      unsigned ws_index;
-
-      ws_index = thr->ts.work_share_generation & team->generation_mask;
-      team->work_shares[ws_index] = NULL;
-      team->oldest_live_gen++;
-      team->num_live_gen = 0;
-
-      free_work_share (ws);
+      if (__builtin_expect (thr->ts.last_work_share != NULL, 1))
+        free_work_share (team, thr->ts.last_work_share);
     }
 
-  gomp_barrier_wait_end (&team->barrier, last);
+  gomp_team_barrier_wait_end (&team->barrier, bstate);
+  thr->ts.last_work_share = NULL;
 }
 
-
 /* The current thread is done with its current work sharing construct.
    This version does NOT imply a barrier at the end of the work-share.  */
 
@@ -188,15 +242,17 @@ gomp_work_share_end_nowait (void)
   struct gomp_work_share *ws = thr->ts.work_share;
   unsigned completed;
 
-  thr->ts.work_share = NULL;
-
   /* Work sharing constructs can be orphaned.  */
   if (team == NULL)
     {
-      free_work_share (ws);
+      free_work_share (NULL, ws);
+      thr->ts.work_share = NULL;
       return;
     }
 
+  if (__builtin_expect (thr->ts.last_work_share == NULL, 0))
+    return;
+
 #ifdef HAVE_SYNC_BUILTINS
   completed = __sync_add_and_fetch (&ws->threads_completed, 1);
 #else
@@ -206,18 +262,6 @@ gomp_work_share_end_nowait (void)
 #endif
 
   if (completed == team->nthreads)
-    {
-      unsigned ws_index;
-
-      gomp_mutex_lock (&team->work_share_lock);
-
-      ws_index = thr->ts.work_share_generation & team->generation_mask;
-      team->work_shares[ws_index] = NULL;
-      team->oldest_live_gen++;
-      team->num_live_gen--;
-
-      gomp_mutex_unlock (&team->work_share_lock);
-
-      free_work_share (ws);
-    }
+    free_work_share (team, thr->ts.last_work_share);
+  thr->ts.last_work_share = NULL;
 }
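
The HAVE_SYNC_BUILTINS branch of free_work_share in the work.c diff above pushes a released work share onto the team's free list with a compare-and-swap retry loop. The following is a minimal standalone sketch of that push pattern only, not libgomp code: `struct node`, `push_free` and `list_length` are hypothetical names introduced for illustration, and only `__sync_bool_compare_and_swap` is the real GCC builtin used in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct gomp_work_share; only the free-list
   link matters for this sketch.  */
struct node
{
  int id;
  struct node *next_free;
};

/* Push N onto the list headed by *HEAD.  The loop re-reads the head and
   retries whenever another thread changed it between the read and the
   compare-and-swap, mirroring the shape of the loop in free_work_share.  */
static void
push_free (struct node **head, struct node *n)
{
  struct node *old;

  assert (n != NULL);
  do
    {
      old = *head;
      n->next_free = old;
    }
  while (!__sync_bool_compare_and_swap (head, old, n));
}

/* Count the nodes reachable from HEAD.  Not thread-safe; intended for
   checking the list once all pushes have finished.  */
static int
list_length (const struct node *head)
{
  int len = 0;

  for (; head != NULL; head = head->next_free)
    len++;
  return len;
}
```

Pushing three nodes in order leaves the most recently pushed node at the head, so the list behaves as a LIFO stack. The sketch covers only the push side; the allocation side in the diff takes nodes off the list in a different way and is not modeled here.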