Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(486)

Issue 6327043: SSE2 optimization for matrix computation when scaling bitmap

Can't Edit
Can't Publish+Mail
Start Review
Created:
13 years, 4 months ago by Lei
Modified:
13 years, 2 months ago
Reviewers:
DerekS, reed1
CC:
epoger
Base URL:
http://skia.googlecode.com/svn/trunk/
Visibility:
Public.

Description

The optimization routines are decal_filter_scale and decal_nofilter_scale. Below command is what I used to get the improvement on Android platform. ./skia_bench -config 8888 -match bitmap_8888_scale -repeat 200 w/o these opts: D/skia ( 1803): running bench [640 480] bitmap_8888_scale D/skia ( 1803): 8888: cmsecs = 55.02 D/skia ( 1803): running bench [640 480] bitmap_8888_scale_filter D/skia ( 1803): 8888: cmsecs = 162.63 w/ these opts: D/skia ( 4947): running bench [640 480] bitmap_8888_scale D/skia ( 4947): 8888: cmsecs = 53.40 D/skia ( 4947): running bench [640 480] bitmap_8888_scale_filter D/skia ( 4947): 8888: cmsecs = 157.67 Actually the benefits are not significant. I ran several times and the result is stable. The net gain is about 3%

Patch Set 1 #

Patch Set 2 : updated patch #

Patch Set 3 : updated the patch after changing the flag from USE_SSE2 to SK_CPU_SSE_LEVEL #

Patch Set 4 : changed the SSE2 check condition at compile time #

Patch Set 5 : updated SSE2 check condition at compile time #

Patch Set 6 : only include matrix optimizations #

Total comments: 1
Unified diffs Side-by-side diffs Delta from patch set Stats (+50 lines, -0 lines) Patch
bench/BitmapBench.cpp View 1 2 3 4 5 2 chunks +3 lines, -0 lines 0 comments Download
src/core/SkBitmapProcState_matrixProcs.cpp View 1 2 3 4 5 3 chunks +47 lines, -0 lines 1 comment Download

Messages

Total messages: 29
Lei
13 years, 4 months ago (2012-06-21 09:17:34 UTC) #1
reed1
The changes in the .cpp are pretty clear (though I can't read those intrinsics yet). ...
13 years, 4 months ago (2012-06-21 17:34:57 UTC) #2
Lei
Hi reed1, thanks for your suggestion. Actually, at the beginning, I also wanted to add ...
13 years, 3 months ago (2012-07-02 10:38:38 UTC) #3
Lei
updated patch
13 years, 3 months ago (2012-07-02 10:38:54 UTC) #4
reed1
I still don't see who sets USE_SSE2 (it is not automatically set when I test ...
13 years, 3 months ago (2012-07-02 14:38:23 UTC) #5
Lei
Sure. I will use this flag.
13 years, 3 months ago (2012-07-03 01:46:27 UTC) #6
Lei
updated the patch after changing the flag from USE_SSE2 to SK_CPU_SSE_LEVEL
13 years, 3 months ago (2012-07-03 01:47:12 UTC) #7
reed1
Sorry, the names have been churning lately. I think the right pattern for SSE2 is ...
13 years, 3 months ago (2012-07-03 12:10:49 UTC) #8
Lei
Sure. I understand and saw the definitions in SkPreConfig.h. I will update the patch.
13 years, 3 months ago (2012-07-04 04:17:47 UTC) #9
Lei
changed the SSE2 check condition at compile time
13 years, 3 months ago (2012-07-04 04:19:04 UTC) #10
Lei
On 2012/07/04 04:19:04, Lei wrote: > changed the SSE2 check condition at compile time sorry, ...
13 years, 3 months ago (2012-07-04 04:21:07 UTC) #11
Lei
updated SSE2 check condition at compile time
13 years, 3 months ago (2012-07-04 04:21:41 UTC) #12
reed1
Lei, the code looks good. Can you include any bench results that demonstrate that we ...
13 years, 3 months ago (2012-07-09 13:36:23 UTC) #13
Lei
Hi Reed, thanks. Actually, we previously did these optimizations based on Android platform. We ran ...
13 years, 3 months ago (2012-07-16 07:20:35 UTC) #14
reed1
Summarizing external benchmarks is good, but what the code-base really needs are benchmarks (either existing ...
13 years, 3 months ago (2012-07-16 14:09:41 UTC) #15
Lei
Hi Reed, thanks for your clarify. I understand your concern. I saw a little bit ...
13 years, 3 months ago (2012-07-17 08:00:01 UTC) #16
reed1
You summarized very well -- anytime we add assembly or other complexity to Skia, we ...
13 years, 3 months ago (2012-07-17 12:16:18 UTC) #17
Lei
After detailed check, it seems that there are already one optimized routine for ClampX_ClampY_filter_scale. That ...
13 years, 3 months ago (2012-07-25 03:22:46 UTC) #18
Lei
only include matrix optimizations
13 years, 2 months ago (2012-08-23 08:10:26 UTC) #19
Lei
and add the extra skia micro bench to demonstrate the benefit of decal_nofilter_scale
13 years, 2 months ago (2012-08-23 08:11:19 UTC) #20
Lei
Actually these two routines also have NEON version. However, they are in Android tree not ...
13 years, 2 months ago (2012-08-23 08:14:26 UTC) #21
Lei
Hi Reed, please help review if you get a chance. Thanks very much!
13 years, 2 months ago (2012-08-27 02:03:15 UTC) #22
reed1
Derek, are we missing some NEON code in android but not in skia?
13 years, 2 months ago (2012-08-27 12:51:04 UTC) #23
DerekS
On 2012/08/27 12:51:04, reed1 wrote: > Derek, are we missing some NEON code in android ...
13 years, 2 months ago (2012-08-27 13:00:53 UTC) #24
reed1
LGTM
13 years, 2 months ago (2012-08-27 13:28:34 UTC) #25
Lei
Hi DerekS, very sorry. Before I haven't realized the structure differences of MatrixProcs between the ...
13 years, 2 months ago (2012-08-28 07:33:47 UTC) #26
DerekS
I'm fine with keeping this code here for now as it should get inlined. https://codereview.appspot.com/6327043/diff/19001/src/core/SkBitmapProcState_matrixProcs.cpp ...
13 years, 2 months ago (2012-08-28 13:31:25 UTC) #27
Lei
Hi DerekS, you are right. I am totally OK to follow the example you did ...
13 years, 2 months ago (2012-08-29 02:29:47 UTC) #28
reed1
13 years, 2 months ago (2012-08-29 19:27:41 UTC) #29
Lei, you have been granted Committer status for Skia. You should be able to check this in yourself. Let me or Elliot (on cc) know if you have any problems.
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b