Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(124)

Issue 4591045: [google] Patch to add compiler flag to dump callgraph edge profiles in special .note sections

Can't Edit
Can't Publish+Mail
Start Review
Created:
14 years, 5 months ago by Sriraman
Modified:
14 years, 5 months ago
Reviewers:
davidxl
CC:
gcc-patches_gcc.gnu.org
Base URL:
svn+ssh://gcc.gnu.org/svn/gcc/branches/google/main/gcc/
Visibility:
Public.

Patch Set 1 #

Unified diffs Side-by-side diffs Delta from patch set Stats (+63 lines, -2 lines) Patch
M common.opt View 1 chunk +4 lines, -0 lines 0 comments Download
M doc/invoke.texi View 2 chunks +10 lines, -1 line 0 comments Download
M final.c View 2 chunks +40 lines, -1 line 0 comments Download
M params.def View 1 chunk +9 lines, -0 lines 0 comments Download

Messages

Total messages: 4
Sriraman
Patch Description: ================= I am working on a project to do global function layout in ...
14 years, 5 months ago (2011-06-08 02:05:06 UTC) #1
Sriraman
+davidxl On Tue, Jun 7, 2011 at 7:05 PM, Sriraman Tallam <tmsriram@google.com> wrote: > Patch ...
14 years, 5 months ago (2011-06-08 16:13:34 UTC) #2
davidxl
ok for google/main. David On Wed, Jun 8, 2011 at 9:13 AM, Sriraman Tallam <tmsriram@google.com> ...
14 years, 5 months ago (2011-06-08 16:16:11 UTC) #3
Sriraman
14 years, 5 months ago (2011-06-08 16:30:04 UTC) #4
On Wed, Jun 8, 2011 at 9:16 AM, Xinliang David Li <davidxl@google.com> wrote: > ok for google/main. Thanks, the patch is now committed. > > David > > On Wed, Jun 8, 2011 at 9:13 AM, Sriraman Tallam <tmsriram@google.com> wrote: >> +davidxl >> >> On Tue, Jun 7, 2011 at 7:05 PM, Sriraman Tallam <tmsriram@google.com> wrote: >>> Patch Description: >>> ================= >>> >>> I am working on a project to do global function layout in the linker where the linker reads the callgraph edge profile information, generated by FDO, and uses that to find a ordering of functions that will place functions calling each other frequently closer, like the Pettis-Hansen code ordering algorithm described in the paper "Profile-guided Code Poisitioning" in PLDI 1990. >>> >>> This patch adds a flag that allows the callgraph edge profile information to be stored .note sections called ".note.callgraph.text". The new compiler flag -fcallgraph-profiles-sections generates these sections and must be used along with -fprofile-use. I have added a PARAM to only output callgraph edges greater than a specified threshold. Once this is available, the linker can read these sections and generate a global callgraph which can be used to determine a global function ordering. >>> >>> I am adding plugin support in the gold linker to allow linker plugins to be able to read the contents of sections and also adding plugin hooks to specify a desired ordering of functions to the linker. The linker patch is available here : http://sourceware.org/ml/binutils/2011-03/msg00043.html. Once this is available, linker plugins can be used to determine the function layout, like the Pettis-Hansen algorithm, of the final binary. >>> >>> Example: The new .note.callgraph.text sections looks like this for a function foo that calls bar 100 times and zap 50 times: >>> **************************** >>> .section        .note.callgraph.text._Z3foov,"",@progbits >>>        .string "Function _Z3foov" >>>        .string "_Z3barv" >>>        .string "100" >>>        .string "_Z3zapv" >>>        .string "50" >>> *************************** >>> >>> For now, this is for google/main. I will re-submit for review to trunk along with data layout. >>> >>> Google ref 41940 >>> >>> 2011-06-07  Sriraman Tallam  <tmsriram@google.com> >>> >>>        * doc/invoke.texi: document option -fcallgraph-profiles-sections. >>>        * final.c  (dump_cgraph_profiles): New function. >>>        (rest_of_handle_final): Create new section '.note.callgraph.text' >>>        with compiler flag -fcallgraph-profiles-sections >>>        * common.opt: New option -fcallgraph-profiles-sections. >>>        * params.def (DEFPARAM): New param >>>        PARAM_NOTE_CGRAPH_SECTION_EDGE_THRESHOLD. >>> >>> Index: doc/invoke.texi >>> =================================================================== >>> --- doc/invoke.texi     (revision 174789) >>> +++ doc/invoke.texi     (working copy) >>> @@ -351,7 +351,7 @@ Objective-C and Objective-C++ Dialects}. >>>  -falign-labels[=@var{n}] -falign-loops[=@var{n}] -fassociative-math @gol >>>  -fauto-inc-dec -fbranch-probabilities -fbranch-target-load-optimize @gol >>>  -fbranch-target-load-optimize2 -fbtr-bb-exclusive -fcaller-saves @gol >>> --fcheck-data-deps -fclone-hot-version-paths @gol >>> +-fcallgraph-profiles-sections -fcheck-data-deps -fclone-hot-version-paths @gol >>>  -fcombine-stack-adjustments -fconserve-stack @gol >>>  -fcompare-elim -fcprop-registers -fcrossjumping @gol >>>  -fcse-follow-jumps -fcse-skip-blocks -fcx-fortran-rules @gol >>> @@ -8114,6 +8114,15 @@ Do not promote static functions with always inline >>>  @opindex fripa-verbose >>>  Enable printing of verbose information about dynamic inter-procedural optimizations. >>>  This is used in conjunction with the @option{-fripa}. >>> + >>> +@item -fcallgraph-profiles-sections >>> +@opindex fcallgraph-profiles-sections >>> +Emit call graph edge profile counts in .note.callgraph.text sections. This is >>> +used in conjunction with @option{-fprofile-use}. A new .note.callgraph.text >>> +section is created for each function. This section lists every callee and the >>> +number of times it is called. The params variable >>> +"note-cgraph-section-edge-threshold" can be used to only list edges above a >>> +certain threshold. >>>  @end table >>> >>>  The following options control compiler behavior regarding floating >>> Index: final.c >>> =================================================================== >>> --- final.c     (revision 174789) >>> +++ final.c     (working copy) >>> @@ -4321,13 +4321,37 @@ debug_free_queue (void) >>>       symbol_queue_size = 0; >>>     } >>>  } >>> - >>> + >>> +/* List the call graph profiled edges whise value is greater than >>> +   PARAM_NOTE_CGRAPH_SECTION_EDGE_THRESHOLD in the >>> +   ".note.callgraph.text" section. */ >>> +static void >>> +dump_cgraph_profiles (void) >>> +{ >>> +  struct cgraph_node *node = cgraph_node (current_function_decl); >>> +  struct cgraph_edge *e; >>> +  struct cgraph_node *callee; >>> + >>> +  for (e = node->callees; e != NULL; e = e->next_callee) >>> +    { >>> +      if (e->count <= PARAM_VALUE (PARAM_NOTE_CGRAPH_SECTION_EDGE_THRESHOLD)) >>> +        continue; >>> +      callee = e->callee; >>> +      fprintf (asm_out_file, "\t.string \"%s\"\n", >>> +               IDENTIFIER_POINTER (decl_assembler_name (callee->decl))); >>> +      fprintf (asm_out_file, "\t.string \"" HOST_WIDEST_INT_PRINT_DEC "\"\n", >>> +               e->count); >>> +    } >>> +} >>> + >>>  /* Turn the RTL into assembly.  */ >>>  static unsigned int >>>  rest_of_handle_final (void) >>>  { >>>   rtx x; >>>   const char *fnname; >>> +  char *profile_fnname; >>> +  unsigned int flags; >>> >>>   /* Get the function's name, as described by its RTL.  This may be >>>      different from the DECL_NAME name used in the source file.  */ >>> @@ -4387,6 +4411,21 @@ rest_of_handle_final (void) >>>     targetm.asm_out.destructor (XEXP (DECL_RTL (current_function_decl), 0), >>>                                decl_fini_priority_lookup >>>                                  (current_function_decl)); >>> + >>> +  /* With -fcgraph-section, add ".note.callgraph.text" section for storing >>> +     profiling information. */ >>> +  if (flag_callgraph_profiles_sections >>> +      && flag_profile_use >>> +      && cgraph_node (current_function_decl) != NULL) >>> +    { >>> +      flags = SECTION_DEBUG; >>> +      asprintf (&profile_fnname, ".note.callgraph.text.%s", fnname); >>> +      switch_to_section (get_section (profile_fnname, flags, NULL)); >>> +      fprintf (asm_out_file, "\t.string \"Function %s\"\n", fnname); >>> +      dump_cgraph_profiles (); >>> +      free (profile_fnname); >>> +    } >>> + >>>   return 0; >>>  } >>> >>> Index: common.opt >>> =================================================================== >>> --- common.opt  (revision 174789) >>> +++ common.opt  (working copy) >>> @@ -907,6 +907,10 @@ fcaller-saves >>>  Common Report Var(flag_caller_saves) Optimization >>>  Save registers around function calls >>> >>> +fcallgraph-profiles-sections >>> +Common Report Var(flag_callgraph_profiles_sections) Init(0) >>> +Generate .note.callgraph.text sections listing callees and edge counts. >>> + >>>  fcheck-data-deps >>>  Common Report Var(flag_check_data_deps) >>>  Compare the results of several data dependence analyzers. >>> Index: params.def >>> =================================================================== >>> --- params.def  (revision 174789) >>> +++ params.def  (working copy) >>> @@ -1002,6 +1002,15 @@ DEFPARAM (PARAM_MVERSN_CLONE_CGRAPH_DEPTH, >>>          "maximum length of the call graph path to be cloned " >>>           "while doing multiversioning", >>>          2, 0, 5) >>> + >>> +/* Only output those call graph edges in .note.callgraph.text sections >>> +   whose count is greater than this value. */ >>> +DEFPARAM (PARAM_NOTE_CGRAPH_SECTION_EDGE_THRESHOLD, >>> +         "note-cgraph-section-edge-threshold", >>> +         "minimum call graph edge count for inclusion in " >>> +          ".note.callgraph.text section", >>> +         0, 0, 0) >>> + >>>  /* >>>  Local variables: >>>  mode:c >>> >>> -- >>> This patch is available for review at http://codereview.appspot.com/4591045 >>> >> > 
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b