Updating the recommendation for Executor GC heuristic in case of high executor GC #442
if (evaluator.severityTimeA.getValue > Severity.LOW.getValue) {
  resultDetails = resultDetails :+ new HeuristicResultDetails("Gc ratio high", "The job is spending too much time on GC. We recommend increasing the executor memory.")
  resultDetails = resultDetails :+ new HeuristicResultDetails("Gc ratio high",
    "The job is spending too much time on GC. Recommended to increase the executor memory and also can enable ParallelGC using spark.executor.extraJavaOptions or reducing number of UDF calls.")
Thanks for fixing this so quickly. For a more customized recommendation, could this check the value for "spark.executor.extraJavaOptions", to see if ParallelGC is already set? I don't think there's an easy way to check for UDFs unfortunately.
For LinkedIn, it would also be nice to include a link to the wiki: https://iwww.corp.linkedin.com/wiki/cf/display/DWH/Big+Data+Engineering+-+Advanced+Spark+SQL+Tuning+Techniques#BigDataEngineering-AdvancedSparkSQLTuningTechniques-GCtuning
Including the link would not work for open source, and there is also the danger of the link changing.
Another option is to provide the details mentioned in the wiki as part of the html explanation/recommendations.
Yes, checking spark.executor.extraJavaOptions is a good idea. For the link, I need to check whether it can be rendered as a proper HTML link on the heuristic page.
Discussed with Min and Fangshi, and the current plan is to add ParallelGC as part of the cluster configuration, in which case we don't need to list this as a recommendation, at least for LinkedIn -- it could still be useful for outside users.
}

val isParallelGcEnabled: Boolean = appConfigurationProperties.getOrElse("spark.executor.extraJavaOptions","").contains("XX:+UseParallelGC")
val parallelGcRecommendation: String = if (isParallelGcEnabled) "" else "Enable ParallelGc using spark.executor.extraJavaOptions."
Change "ParallelGc" to "ParallelGC", in case of copy-paste.
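For reference, the check discussed in this thread can be sketched in isolation as below. This is only a minimal illustration, not the code in the PR: the object `GcAdvice` and its members are hypothetical names, the application properties are assumed to be available as a plain `Map[String, String]`, and the option string is assumed to be whitespace-separated with no quoting. It matches the whole `-XX:+UseParallelGC` token rather than a substring, and uses the "ParallelGC" spelling requested above.

```scala
// Hypothetical sketch; GcAdvice and its members are illustrative names,
// not part of Dr. Elephant.
object GcAdvice {
  val ExtraJavaOptionsKey = "spark.executor.extraJavaOptions"
  val ParallelGcFlag = "-XX:+UseParallelGC"

  // True when the executor JVM options already contain the ParallelGC flag.
  // Assumes options are separated by whitespace and are not quoted.
  def isParallelGcEnabled(props: Map[String, String]): Boolean =
    props.getOrElse(ExtraJavaOptionsKey, "").split("\\s+").contains(ParallelGcFlag)

  // Builds the recommendation text, adding the ParallelGC hint only when
  // the flag is not already set.
  def recommendation(props: Map[String, String]): String = {
    val base = "The job is spending too much time on GC. " +
      "We recommend increasing the executor memory."
    if (isParallelGcEnabled(props)) base
    else s"$base Enabling ParallelGC using $ExtraJavaOptionsKey could also help."
  }
}
```

With an empty property map, `GcAdvice.recommendation` includes the ParallelGC hint; once the flag appears in `spark.executor.extraJavaOptions`, only the memory recommendation remains.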
6c4a047 to afd36d6
<p>We recommend increasing the executor memory.</p>
<p>Enabling ParallelGC using spark.executor.extraJavaOptions could help.</p>
<p>Also recommended to reduce the number of UDF calls.</p>
<p>For more help refer <a href="https://iwww.corp.linkedin.com/wiki/cf/display/DWH/Spark+SQL+Tuning+Techniques" target="_blank">here</a></p>
This is a LinkedIn internal page -- is this OK to add to Dr. Elephant?
Other options are adding the information to the HTML detailed recommendations, or creating a page on the external Dr. Elephant wiki.
Looks good, thanks!
fc7de94 to ffe4d17
a080131 to 73795ce
@edwinalu thanks for the review. Merging this branch with master introduced some merge conflicts, which I have now resolved. If feasible, kindly look through the changes and let me know whether I should merge them.
Description
Adds more recommendations to the Executor GC heuristic for applications that show high executor GC time.