INTERACT FORUM

Author Topic: NEW: Faster image / user interface with SSE 4.1  (Read 6604 times)

Matt

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 42389
  • Shoes gone again!
NEW: Faster image / user interface with SSE 4.1
« on: March 16, 2012, 07:43:19 pm »

The SSE image code requires SSE 4.1, which I think means a 45 nm Core 2 (Penryn) or newer.

If you have a CPU without SSE 4.1, please let me know if you see any problems (you shouldn't unless our SSE detection isn't quite right).

The core image code is some of the most time critical in the program, since it underpins all the user interface (Standard View and Theater View).

Here's my image benchmark from a Sandy Bridge:

17.0.108

Running 'Image' benchmark...
    Image creation / destruction... 0.687 seconds
    Flood filling... 0.334 seconds
    Direct copying... 0.595 seconds
    Small renders... 1.731 seconds
    Bilinear rendering... 0.902 seconds
    Bicubic rendering... 0.814 seconds
Score: 4345

17.0.109

Running 'Image' benchmark...
    Image creation / destruction... 0.644 seconds
    Flood filling... 0.328 seconds
    Direct copying... 0.595 seconds
    Small renders... 1.281 seconds
    Bilinear rendering... 0.768 seconds
    Bicubic rendering... 0.892 seconds
Score: 4880

17.0.110

Running 'Image' benchmark...
    Image creation / destruction... 0.719 seconds
    Flood filling... 0.334 seconds
    Direct copying... 0.595 seconds
    Small renders... 1.064 seconds
    Bilinear rendering... 0.766 seconds
    Bicubic rendering... 0.624 seconds
Score: 5361

17.0.111

Running 'Image' benchmark...
    Image creation / destruction... 0.686 seconds
    Flood filling... 0.327 seconds
    Direct copying... 0.596 seconds
    Small renders... 1.033 seconds
    Bilinear rendering... 0.710 seconds
    Bicubic rendering... 0.623 seconds
Score: 5536

27.4% faster overall

I think there's a little more performance left to find, although I'm not sure when we'll find it.

Proper support of partial alpha makes the algorithms a lot tougher, because you end up having to deal with colors and alpha independently. Otherwise you get weird results: for example, drawing from a transparent swatch of a funny color would bleed or fade that color into the output.
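
To illustrate the bleed problem, here is a minimal sketch, assuming straight (non-premultiplied) RGBA pixels stored as float tuples in the 0.0 to 1.0 range. The function names are hypothetical and this is not the actual Media Center code. It averages two pixels, as a resampler would, first naively and then with the color channels weighted by alpha.

```python
def average_naive(p, q):
    """Average two straight-alpha RGBA pixels channel by channel."""
    return tuple((a + b) / 2 for a, b in zip(p, q))


def average_alpha_weighted(p, q):
    """Average two straight-alpha RGBA pixels, weighting each color by its alpha."""
    alpha_sum = p[3] + q[3]
    if alpha_sum == 0:
        # Both pixels are fully transparent, so no color is visible at all.
        return (0.0, 0.0, 0.0, 0.0)
    rgb = tuple((a * p[3] + b * q[3]) / alpha_sum for a, b in zip(p[:3], q[:3]))
    return rgb + (alpha_sum / 2,)


opaque_white = (1.0, 1.0, 1.0, 1.0)
transparent_red = (1.0, 0.0, 0.0, 0.0)  # invisible, but still carries a color

naive = average_naive(opaque_white, transparent_red)
weighted = average_alpha_weighted(opaque_white, transparent_red)

print(naive)     # (1.0, 0.5, 0.5, 0.5): the invisible red tints the result
print(weighted)  # (1.0, 1.0, 1.0, 0.5): still white, just half as opaque
```

The weighted version is equivalent to converting to premultiplied alpha, averaging, and converting back, which is one common way to keep colors and alpha from contaminating each other.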
Logged
Matt Ashland, JRiver Media Center

Matt

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 42389
  • Shoes gone again!
NEW: Faster image / user interface with SSE 4.1
« Reply #1 on: March 16, 2012, 07:49:01 pm »

Also, just a little note that the "bicubic" in the benchmark is actually something better than bicubic.  I'm just not sure what to call it.

Intel calls it supersampling, but that's a term used for anti-aliasing, so that seems confusing.

More here:
http://yabb.jriver.com/interact/index.php?topic=68213.0
Logged
Matt Ashland, JRiver Media Center

jmone

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 14465
  • I won! I won!
NEW: Faster image / user interface with SSE 4.1
« Reply #2 on: March 16, 2012, 07:56:49 pm »


V108
Running 'Image' benchmark...
    Image creation / destruction... 1.144 seconds
    Flood filling... 0.722 seconds
    Direct copying... 0.735 seconds
    Small renders... 2.879 seconds
    Bilinear rendering... 1.728 seconds
    Bicubic rendering... 1.453 seconds
Score: 2540

V109
Running 'Image' benchmark...
    Image creation / destruction... 1.102 seconds
    Flood filling... 0.718 seconds
    Direct copying... 0.741 seconds
    Small renders... 2.231 seconds
    Bilinear rendering... 1.511 seconds
    Bicubic rendering... 1.665 seconds
Score: 2761

The scores move around a lot for me (it could be that I had a few things running while testing).
Logged
JRiver CEO Elect

Hendrik

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 10945
NEW: Faster image / user interface with SSE 4.1
« Reply #3 on: March 17, 2012, 11:48:41 am »

Quote
Intel calls it supersampling, but that's a term used for anti-aliasing, so that seems confusing.

Supersampling is a pretty generic term for the process of basically increasing the size of an image then applying some processing and shrinking the image to the desired size.
Anti-Aliasing just coined the term, really.
Logged
~ nevcairiel
~ Author of LAV Filters

Matt

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 42389
  • Shoes gone again!
NEW: Faster image / user interface with SSE 4.1
« Reply #4 on: March 17, 2012, 03:33:19 pm »

Quote
Supersampling is a pretty generic term for the process of basically increasing the size of an image then applying some processing and shrinking the image to the desired size.
Anti-Aliasing just coined the term, really.

Is there a standard name for a full-quality shrink that doesn't skip pixels?

Nearest neighbor uses 1 pixel.

Bilinear uses 4 pixels.

Bicubic uses 16 pixels.

What's the name of using all the pixels necessary (so a big shrink might look at 1000 or more source pixels to build each output pixel)?
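
For reference, this kind of shrink is often called area averaging or a box filter. Here is a minimal sketch of the idea, assuming a grayscale image stored as a list of rows and an integer shrink factor. The function names are made up for illustration and this is not the Media Center implementation. Every source pixel under an output pixel contributes to it, in contrast to a shrink that skips pixels.

```python
def shrink_area_average(image, factor):
    """Shrink a grayscale image by an integer factor, averaging every source pixel."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("this sketch needs dimensions divisible by factor")
    out = []
    for oy in range(height // factor):
        row = []
        for ox in range(width // factor):
            total = 0.0
            # Visit the full factor x factor block behind this output pixel.
            for sy in range(oy * factor, (oy + 1) * factor):
                for sx in range(ox * factor, (ox + 1) * factor):
                    total += image[sy][sx]
            row.append(total / (factor * factor))
        out.append(row)
    return out


def shrink_nearest(image, factor):
    """Shrink by keeping one source pixel per output pixel and skipping the rest."""
    return [row[::factor] for row in image[::factor]]


# A 4x4 image with one bright pixel that a pixel-skipping shrink misses entirely.
source = [
    [0, 0, 0, 0],
    [0, 255, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
averaged = shrink_area_average(source, 2)
skipped = shrink_nearest(source, 2)

print(averaged)  # [[63.75, 0.0], [0.0, 0.0]]
print(skipped)   # [[0, 0], [0, 0]]
```

With a shrink factor of 32 in each direction, the inner loops read 1024 source pixels per output pixel, which matches the "1000 or more" case above.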
Logged
Matt Ashland, JRiver Media Center

Matt

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 42389
  • Shoes gone again!
Re: NEW: Faster image / user interface with SSE 4.1
« Reply #5 on: March 19, 2012, 02:46:55 pm »

Some more speed is coming next build.

The JRMark of the image portion on my work machine looks like this (fastest of 3 runs):

MMX (17.0.108 and earlier): 4341
SSE 4.1: 5115
17.8% faster
Logged
Matt Ashland, JRiver Media Center

Matt

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 42389
  • Shoes gone again!
Re: NEW: Faster image / user interface with SSE 4.1
« Reply #6 on: March 19, 2012, 03:52:51 pm »

And some more speed.

The JRMark of the image portion on my work machine looks like this (fastest of 3 runs):

MMX (17.0.108 and earlier): 4341
SSE 4.1: 5427
25.0% faster
Logged
Matt Ashland, JRiver Media Center

Alex B

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 10121
  • The Cosmic Bird
Re: NEW: Faster image / user interface with SSE 4.1
« Reply #7 on: March 19, 2012, 07:05:49 pm »

Do you guys always get reproducible JRMark results?

On my 3.1 GHz quad-core AMD Phenom II the JRMark scores always seem to be all over the place. Consecutive runs produce wildly varying results. A couple of days ago I wanted to compare the image performance "before" and "after". Here is what I got from eight consecutive runs each with 17.0.107 and 17.0.109:

Version 17.0.107
Running 'Image' benchmark...
Score: 1401
Score: 1657
Score: 1338
Score: 1497
Score: 1414
Score: 1504
Score: 1626
Score: 1455

Version 17.0.109
Running 'Image' benchmark...
Score: 1576
Score: 1434
Score: 1528
Score: 1594
Score: 1523
Score: 1650
Score: 1597
Score: 1638

Before testing I closed all other programs and unnecessary background processes. I had the ASUS EPU-4 power saving engine running, but it was set to the "High Performance" mode, which should disable all power saving features.

I really can't say if the performance of 109 is any different from 107. This oddity is not limited to the image performance. All other scores are similarly "unstable". In general the scores seem to be low compared to those that more or less comparable Intel CPUs produce, but perhaps the compiler favors Intel.

I attached the complete JRMark results. I have not yet tried 110.
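
One way to judge whether the two sets really differ is to compare the difference in means against the run-to-run spread. Below is a minimal sketch using the eight scores per version listed above. The variable names are only for illustration, and with so few runs this is a rough indication rather than a proper significance test.

```python
from statistics import mean, stdev

scores_107 = [1401, 1657, 1338, 1497, 1414, 1504, 1626, 1455]
scores_109 = [1576, 1434, 1528, 1594, 1523, 1650, 1597, 1638]

mean_107, mean_109 = mean(scores_107), mean(scores_109)
# stdev() is the sample standard deviation (divides by n - 1).
spread_107, spread_109 = stdev(scores_107), stdev(scores_109)
gain_percent = (mean_109 / mean_107 - 1) * 100

print(f"17.0.107: mean {mean_107:.1f}, std dev {spread_107:.1f}")
print(f"17.0.109: mean {mean_109:.1f}, std dev {spread_109:.1f}")
print(f"difference in means: {mean_109 - mean_107:.1f} ({gain_percent:.1f}%)")
```

The means come out at 1486.5 and 1567.5, a gap of about 5%, which is smaller than the standard deviation of the 17.0.107 runs (about 110 points). That fits the impression that these runs are too noisy to show a clear difference.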
Logged
The Cosmic Bird - a triple merger of galaxies: http://eso.org/public/news/eso0755

Matt

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 42389
  • Shoes gone again!
Re: NEW: Faster image / user interface with SSE 4.1
« Reply #8 on: March 19, 2012, 07:22:45 pm »

JRMark scores will vary a little since it isn't a long test, but assuming the system isn't loaded, scores are normally within 50 or 100 points each run.

Also, the new image performance is only available if you have SSE 4.1.  I think this means you'll need a 45 nm Core 2 (Penryn) or newer Intel CPU such as an i5/i7, or an AMD Bulldozer.
Logged
Matt Ashland, JRiver Media Center

Matt

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 42389
  • Shoes gone again!
Re: NEW: Faster image / user interface with SSE 4.1
« Reply #9 on: March 19, 2012, 07:30:19 pm »

Quote
In general the scores seem to be low when compared to those that some more or less comparable Intel CPUs produce, but perhaps the compiler favors Intel.

We don't intentionally favor Intel or AMD.

But Intel currently makes the fastest chips, so that's what we use in development.  When compiling, you can never have enough CPU power.

The side effect of this is that the vendor that makes the fastest CPUs gets their CPUs used when we tune our algorithms.
Logged
Matt Ashland, JRiver Media Center

Alex B

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 10121
  • The Cosmic Bird
Re: NEW: Faster image / user interface with SSE 4.1
« Reply #10 on: March 19, 2012, 08:05:00 pm »

Quote
JRMark scores will vary a little since it isn't a long test, but assuming the system isn't loaded, scores are normally within 50 or 100 points each run.

My complete JRMark results varied in 16 tests from 1188 to 1538.

I think this motherboard has some power saving options in the BIOS (which are controlled by the EPU-4 program). I could try to disable them in the BIOS to see if that makes any difference.
Logged
The Cosmic Bird - a triple merger of galaxies: http://eso.org/public/news/eso0755

Hendrik

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 10945
Re: NEW: Faster image / user interface with SSE 4.1
« Reply #12 on: March 20, 2012, 01:56:33 am »

I doubt they are using Intel's compiler, though. More likely they are using Microsoft's compiler.
Logged
~ nevcairiel
~ Author of LAV Filters

leezer3

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 1588
Re: NEW: Faster image / user interface with SSE 4.1
« Reply #13 on: March 20, 2012, 11:10:44 am »

This may or may not be related, but I'm seeing some artifacts when resizing an MC window quickly (very obvious with a detached display). It looks like a bouncing copy of the window border for a couple of seconds.
I'm on the Catalyst 12.3 beta drivers. This started about when you made these changes, but that is also when I installed these drivers...

Can try rolling back one or the other if you feel this is likely to be related.

-Leezer-
Logged

Matt

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 42389
  • Shoes gone again!
Re: NEW: Faster image / user interface with SSE 4.1
« Reply #14 on: March 21, 2012, 05:39:06 pm »

I've updated the benchmarks at the top to show the latest build.  27.4% faster overall, and 31.4% faster at core rendering (the last three numbers in the benchmark).
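
For anyone checking the arithmetic, here is a small sketch that reproduces both figures from the 17.0.108 and 17.0.111 results in the first post. The helper names are made up for illustration. It assumes the overall figure is the score ratio and the core rendering figure is the reduction in total time for the last three tests, which is the reading that matches the posted numbers.

```python
def percent_higher_score(old_score, new_score):
    """Percentage increase in benchmark score (higher is better)."""
    return (new_score / old_score - 1) * 100


def percent_less_time(old_seconds, new_seconds):
    """Percentage reduction in elapsed time (lower is better)."""
    return (1 - new_seconds / old_seconds) * 100


# Overall scores for 17.0.108 and 17.0.111.
overall = percent_higher_score(4345, 5536)

# Small renders + bilinear + bicubic times, in seconds.
core_old = 1.731 + 0.902 + 0.814
core_new = 1.033 + 0.710 + 0.623
core = percent_less_time(core_old, core_new)

print(f"overall: {overall:.1f}%")         # overall: 27.4%
print(f"core rendering: {core:.1f}%")     # core rendering: 31.4%
```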
Logged
Matt Ashland, JRiver Media Center

JustinChase

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3276
  • Getting older every day
Re: NEW: Faster image / user interface with SSE 4.1
« Reply #15 on: March 21, 2012, 05:45:16 pm »

Quote
27.4% faster overall, and 31.4% faster at core rendering

I have to admit that this impresses me.  After a dozen years, and consistent changes to improve speed that entire time, you just added another quarter to one-third speed boost.  You didn't just eke out a 3-5% increase; nope, you wrung out another 25%+

Good work!
Logged
pretend this is something funny

rjm

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 2699
Re: NEW: Faster image / user interface with SSE 4.1
« Reply #16 on: March 21, 2012, 08:33:51 pm »

Well done! You've motivated me to update my system.
Logged

Matt

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 42389
  • Shoes gone again!
Re: NEW: Faster image / user interface with SSE 4.1
« Reply #17 on: March 28, 2012, 12:15:28 pm »

bump
Logged
Matt Ashland, JRiver Media Center