INTERACT FORUM
Devices => PC's and Other Hardware => Topic started by: Matt on January 25, 2013, 05:45:54 pm
-
We're evaluating a new build process for Media Center that's a little slower but possibly more error resistant.
My work machine has a JRMark of about 5100. It's an overclocked i7 2600k. Everything is on SSDs (except media).
Is there anything available that would be considerably faster, even a Xeon / server machine? ECC memory would be a plus.
Water cooling or hard core overclocking would be a little too much.
My understanding is that an Ivy Bridge would only be worth 10-15%.
Is there anything coming down the pipe?
Thanks for any advice.
-
Depends on how much you want to spend and how parallelised your build process is, TBQH. You're probably about right with the Ivy Bridge figure, and that'll apply to most desktop chips.
I don't know what your budget is, but I'd be seriously considering a dual or quad CPU Xeon machine, giving you either 8 or 16 cores to play with :)
That'd also start having implications for what version of Windows you can run, mind; I *think* the standard versions are only licensed for a single CPU socket, hence you'd have to invest in a server OS.
It basically then depends on your budget; speccing a fully decked-out dual socket Xeon server would cost you somewhere in the region of $5,000, but assuming a nicely parallel build process you'd probably get a 40-60% boost.
Almost sounds like you need to hire a pro with the appropriate equipment to find your bottlenecks - no use adding more cores if something bottlenecks and holds up the whole build process.
-Leezer-
-
Compilations of large programs are almost always CPU-bound.
I remember changing our old Unix build system. It used to take hours to compile the kernel. With parallel compilation units, it went down to about 4 minutes. Blazing.
-
I agree it's mostly CPU. Although SSDs help a lot because of the huge amount of intermediate data that gets written to disk.
The build process parallelizes pretty well.
It compiles 8 projects at a time for the ~110 projects of a build.
However, I think it might get to a point where it's spending a lot of time on one big static library (the main part of MC) with nothing else running. If we broke that library up or at least made sure it started first, it might help.
It might also be interesting to run the 32-bit and 64-bit legs of the compile at the same time. Is there any easy way with a batch file to get two things running at once and still catch the return code and handle it? Or is that a job for a custom program at that point?
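For illustration, the custom program wouldn't need to be much more than this rough sketch (the batch file names are just placeholders, not our real scripts):

#include <cstdlib>
#include <future>
#include <iostream>
#include <string>

// Runs a command through the shell and returns its exit code.
static int RunCommand(const std::string& command)
{
    // On Windows, std::system returns the exit code reported by the command processor.
    return std::system(command.c_str());
}

int main()
{
    // Launch the two legs on separate threads.
    auto leg32 = std::async(std::launch::async, RunCommand, std::string("build32.bat"));
    auto leg64 = std::async(std::launch::async, RunCommand, std::string("build64.bat"));

    const int result32 = leg32.get();
    const int result64 = leg64.get();

    std::cout << "32-bit leg exited with " << result32 << "\n"
              << "64-bit leg exited with " << result64 << "\n";

    // Report failure if either leg failed, so a calling script still has one return code to check.
    return (result32 != 0 || result64 != 0) ? 1 : 0;
}

The same pattern would extend to more than two legs.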
-
Is there anything coming down the pipe?
Thanks for any advice.
Wait for a few months after Haswell ships (June) and they'll ship the Ivy-based Xeons (the current ones are based on Sandy Bridge-E).
Do that, and get a dual CPU workstation. Assuming the build process is sufficiently parallel (I think they usually are), that might provide some nice returns. For consumer-space boards? No, absolutely wait for Haswell, and consider the "high-end" line of CPUs if memory bandwidth matters.
-
No, absolutely wait for Haswell, and consider the "high-end" line of CPUs if memory bandwidth matters.
I've read Haswell will be around 10% faster per clock and clocked about the same speeds. Have you seen anything more encouraging?
Will it overclock better or offer 6 core versions or something?
The power consumption stuff they've done is awesome, but it's not exciting from a straight-line speed perspective.
-
Right, but you're on Sandy, and Haswell is likely to be another 10-15% bump over Ivy. So then you're in the 20-25% range of IPC improvement, which is substantial (look at the reviews for particular workloads when they come out and see if they apply to you). Also, if you're interested in a workstation dual-CPU system, it just wouldn't make sense to buy now, when they're still selling Sandys there (and you already have one). They're better Sandys, but still Sandys. Of course, they do have six and eight core versions.
But also...
The power consumption stuff they've done is awesome, but it's not exciting from a straight-line speed perspective.
Improved power consumption on the same process node == better overclocking.
Ivy is a process shrink, which gives efficiency gains just by its very nature. But, they're not good at the process yet (and keep in mind, of course, Intel does it first). Haswell, like Sandy before it, is an "architecture node" in their tick/tock release cycle. This means that they've now gotten good at a particular process node, and have already figured out the best way to optimize for yields and top binnings.
This means, better overclocking (usually, unless they hit issues with the new design, and Intel doesn't often screw up).
Very generally speaking, Sandy overclocks better than Ivy. Ivy has some IPC advantages, and the GPU is way better (they seem to be alternating the GPU tick/tocks with the CPU cores), but it doesn't overclock very well. Since they started this new cadence, the Tocks (new architecture) have tended to be the ones you want for overclocking, because they are the optimized releases on a particular node.
-
Compiling is multi-threaded, so you could greatly benefit from a multi-CPU setup, so grab Ivy Xeons once they are available. :)
I wish Intel would start offering 6 core CPUs in their mainline (i.e. non-E lines) setup; I would get one even if it's rather pricey.
-
Compilation speed basically comes down to raw PC speed, how effectively your build environment takes advantage of parallel processing, and sometimes the hard disk speed.
I have used SCONS before as my build environment and it did a really great job at splitting compilation into parallel tasks automatically. Other build tools also have similar parallel compilation support but my experience was mainly with SCONS (it's especially great if you already know Python).
It might be a fairly daunting task to recreate the MC18 build environment in SCONS or other build environment but it also will give you the benefit of helping to standardize the build environment across multiple operating systems.
-
Breathe .... just lost half an hour typing due to hitting the wrong key :'(
I'll be back later.
-
I've just run JRMark on my 16GB Ivy 3770k system running @4.5Ghz. It has a Vertex4 512GB system drive. It should be pretty speedy, yet it only scores around 4900 JRMark. So that's a pretty speedy Sandy Bridge system you have there!
Probably your best speedup now might be to go for a 6-core Sandy Bridge 3960X, which might be as simple as swapping the chip on your current system: http://www.anandtech.com/show/5091/intel-core-i7-3960x-sandy-bridge-e-review-keeping-the-high-end-alive/5
-
Might want to see this too: Building The Linux Kernel In 60 Seconds (http://www.phoronix.com/scan.php?page=news_item&px=MTAyNjU)
-
As said above, compiling heavily depends on CPU power and because it's so heavily multithreaded, the more cores the happier it gets.
You're coming from a Core i7 2600K, which is a 4-core processor supporting hyperthreading. I don't think HT adds much benefit to compiling unless the compiler is optimized to use HT. In some cases, disabling HT can even make certain tasks faster (and allows for higher overclocks and a cooler running CPU). The Core i7 is clocked at 3.4GHz and although there are faster CPUs, even adding a single core would benefit you more than increasing clockspeeds (4 x 600MHz = 2.4GHz from extra clockspeed, while a single extra core adds 3.8GHz). So, based on that, adding cores is your ticket to heaven.
You mentioned server hardware, but I can tell you the common server hardware from Dell or HP will be slower than what you can build yourself for less. That stuff is built to be reliable, not to be the fastest kid on the block. You say ECC would be a plus but I think you're wrong. Buffered ECC is quite a bit slower and although it sounds good, I can't remember the last time I had to have memory replaced in one of our servers. I remember it happened once and even that time it turned out to be the mainboard. Mind you, I'm working in a datacenter with over 1,000 systems and I've been here since 2004. Some of that hardware is still running. Physical memory failures are rare and if there are issues due to memory, it's because it's incompatible with the mainboard, it's incorrectly configured, incompatible sticks are mixed together or it's running too hot. Solve those, run a 24-hour stress test and you can be fairly certain memory won't be an issue. Manufacturers can give lifetime warranties on their sticks - they don't do that if it's even remotely likely to fail.
CPUs currently available that would add a significant speed bump
Intel Xeon E5-2687W, 8-core, HT, 3.1GHz/3.8GHz
Intel Xeon E5-2690, 8-core, HT, 2.9GHz/3.8GHz
Intel Core i7 3970X, 6-core, HT, 3.5GHz/4GHz
Whether those are an in-place upgrade to your current cpu I don't know, but we want to do things properly so:
Dual socket mainboards
Intel DBS2600CP4, dual socket 2011, 16x DDR3, C602 chipset
Asus Z9PE-D8 WS, dual socket 2011, 8x DDR3, C602 chipset
Supermicro X9DA7, dual socket 2011, 16x DDR3, C602 chipset
All three CPUs are Sandys because we determined you need more cores and there are no 8-core or even 6-core Ivys available. Intel promised a 6-core i7 Ivy for the 2nd half of 2012, but so far we've seen none, and no word from Intel either unless I missed something. Either way, they don't exist. They may not come any time soon either, with Haswell scheduled for Q3/Q4 this year, at least not until Intel has something in the Haswell pipeline to match it.
Reasons not to wait for Haswell
Intel is promising a 10-15% speedbump from Haswell over Ivy. Ivy promised a similar speedbump over Sandy. But practically what we've seen is more around 5% (http://www.overclock.net/t/1242313/more-ivy-bridge-benchmarks-sandybridge-comparison-3770k-vs-2600k-performance-temps-etc-couple-of-ln2-scores-are-up/0_50). Expect something similar for Haswell. Realistically we should expect around 10% over Sandy from Haswell. Intel's predictions have always been too optimistic and it's basically just marketing bloat. Adding cores will provide a bigger speed increase, something that can be done with Sandy, but not with Ivy or Haswell.
Haswell isn't even here. Haswell may not even come in May/June. It may be a big disappointment, at which point Sandys may even rise in cost because of higher demand. We can expect the same thing we've seen with every new CPU Intel introduced: ~5% speed increase and clockspeeds similar to the mainstream parts. There won't be 6 or 8-cores, there won't be extreme clockspeeds. Haswell won't beat the fastest Ivys or even the fastest Sandys in the first 6 months after release. Even if Intel could deliver something like that right off the bat, they won't, because it's not the smart thing to do. They need something new in 6 to 12 months. They want people to buy at introduction and upgrade later, not buy now and then not upgrade for 2 years. If you want to wait for a Haswell that can compete with 12 or 16 cores of Sandy violence, you may need to wait a year and a half, maybe even two.
Another reason not to wait for Haswell could be the Ivy rumor mill (http://www.engadget.com/2012/10/17/intel-roadmap-reveals-10-core-xeon-e5-2600-v2-cpu/). Rumors have it that Intel is planning a 10-core Ivy Xeon supporting HT, 30MB cache and 1866MHz DDR3, compatible with current socket 2011 boards. Dare I say it even mentioned a 12-core in the pipeline? Imagine dual socket, 24-core @ 4GHz. I can assure you no Haswell in the coming years will beat that.
Reasons to wait for Haswell
You could just upgrade to an i7 3970X now and wait for Haswell. It's most likely an in-place upgrade for your current CPU (but check that your mainboard supports it!) and it's relatively cheap compared to the other options available right now. It gives you 2 more cores and theoretically more than a 50% speed bump (2 more cores and a 600MHz per-core speed increase). Haswell will very likely bring much better power efficiency. If you wait, you get a new socket, new chipset, PCI-E 3.0 and I don't know what else. Good stuff. Besides, socket 2011 is not very future proof anymore. It will give you 12 or 16 cores in a dual socket setup and with that a massive speed increase over what you have right now, but you only get to do it once. Well, perhaps when/if the 10 or 12 core Ivy rumors are true and those chips really fit in the 2011 boards I listed above, you could do it again, but that is a major gamble: if they don't show up or don't fit on those boards, you're stuck replacing it all again for a Haswell platform.
To sum it up
- Do nothing. Wait for Haswell. Yawn.
- Upgrade your current system with an Intel Core i7 3970X. Relatively cheap and provides about 50% speed increase, give or take. Wait for Haswell.
- Get a new dual socket mainboard and put 2 6-core Intel Core i7 3970X on it. Not so cheap anymore but will likely be more than twice as fast as your current system.
- Get a new dual socket mainboard and put 2 8-core Xeons on it. Rule the world hard and fast. The most expensive option, but like 4 times as fast as what you have now.
Note: The speed increases I estimate are linear because I assume compiling scales linearly with both cpu speed and extra cores. It may not work out that way because 1) the assumption is incorrect, 2) there are other unforeseen hardware limitations or 3) both 1 and 2 apply ::)
Other things to consider
Although storage may not be a limiting factor in throughput, when you build a system around a dual socket with 12 or 16 cores (and possibly even 20 or 24), the sheer number of I/O requests and interrupts can become a bottleneck on another component like the controller if all disks are connected to it. Spreading disks over multiple controllers and separate buses would help prevent such issues. Looking at your workflow and optimizing your disk layout to match can also help, so that you're not reading and writing to the same disk, or so that your temporary files during compilation are not on the same disk as the source or destination.
I bet the same can be said for memory as well. For normal situations in regular desktop work, having faster memory doesn't really pay off in terms of a noticeable performance boost. But having so many cores running threads poking around in memory simultaneously may expose bottlenecks that normally never show. What might help here is not just higher-clocked RAM, but lower CAS latencies. Typically, the cheaper sticks that boast high clockspeeds also have very high CAS latencies, but lower latencies might provide better performance in this case.
-
If going the 6-core route it seems sensible to compare 3960X and 3970X and buy the cheaper. Both are SB parts and the difference seems to be only the default clock speed. Any 3960X part will overclock way beyond the off the shelf 3970X speed and there are reports that some 3960X parts overclock better than some 3970X parts. Basically they are interchangeable and best to buy whichever is cheaper!
IM makes a good point about investing in plenty of good, fast, low latency RAM.
-
Thanks for all the advice everyone.
This thread got me looking into the concurrency of compiling a little more. I noticed that Visual Studio can now use multiple cores to compile a single project:
http://msdn.microsoft.com/en-us/library/vstudio/bb385193.aspx
Currently we compile multiple projects at a time, but don't enable multi-processor compiling of a single project. Hopefully turning that on for the bigger projects (or maybe all of them) will win us some additional performance.
-
Another reason not to wait for Haswell could be the Ivy rumor mill (http://www.engadget.com/2012/10/17/intel-roadmap-reveals-10-core-xeon-e5-2600-v2-cpu/). Rumors have it that Intel is planning a 10-core Ivy Xeon supporting HT, 30MB cache and 1866MHz DDR3, compatible with current socket 2011 boards. Dare I say it even mentioned a 12-core in the pipeline? Imagine dual socket, 24-core @ 4GHz. I can assure you no Haswell in the coming years will beat that.
That's a reason to wait for Haswell, not the reverse.
Intel's pattern over the past few years has been to introduce new architecture on the tock part of the cadence (second gen on the same process node), and at the same time (roughly), release the Xeon versions (and other high-end workstation chips) based on the previous-gen's (tick) architecture. Right now, the high end chips are all based on Sandy, but we're likely to get Ivy-based Xeons sometime around June. It is all about their cadence and systematic refinement.
Even if you don't buy a Haswell, I think there is a reason to wait for Haswell, assuming you already have a high-performing Sandy (which he does). In other words, rather than the current 2xxx-series Xeons, we'll have the 3xxx series soon.
Also...
Intel is promising a 10-15% speedbump from Haswell over Ivy. Ivy promised a similar speedbump over Sandy. But practically what we've seen is more around 5% (http://www.overclock.net/t/1242313/more-ivy-bridge-benchmarks-sandybridge-comparison-3770k-vs-2600k-performance-temps-etc-couple-of-ln2-scores-are-up/0_50).
It is all workload dependent. I've seen plenty of benchmarks where Ivy shows (even dramatic) results of its IPC gains. As I explained above, Sandy (being a tock) is likely to be better for overclocking (it is). But at identical clockspeeds, many applications do show speedups of between 5-15%. The average is certainly not just "around 5%". It just depends on what you do (though many games have tended to be more on the lower-end of the scale).
In any case, I don't think we're in the sweet spot for buying right now. It also isn't the worst time possible (as April/May would be), but... If you can (and can still go through with it in June/July as easily), I'd wait a piece.
-
I've just run JRMark on my 16GB Ivy 3770k system running @4.5Ghz. It has a Vertex4 512GB system drive. It should be pretty speedy, yet it only scores around 4900 JRMark. So that's a pretty speedy Sandy Bridge system you have there!
Probably your best speedup now might be to go for a 6-core Sandy Bridge 3960X, which might be as simple as swapping the chip on your current system: http://www.anandtech.com/show/5091/intel-core-i7-3960x-sandy-bridge-e-review-keeping-the-high-end-alive/5
I just ran the JRMark (18.0.124) on an Ivy i5 3570k @ 4.4 with 8GB of RAM and 2 Kingston HyperX 128GB SSDs in RAID 0 (striped) and got a 5362. This puzzles me a bit since a 3770k @ 4.5 should get better numbers, as I can't see my RAID SSDs affecting the result that much (could be wrong though).
Matt, I would go with a multiple CPU setup if I wanted/needed more speed since IMHO no single socket setup out there is going to make a big enough difference to warrant the cost.
-
Yeah, I didn't spend a lot of time on it, just fired it up, ran the test and posted. Maybe I should look into it a bit more.
Where does the test read/write data? Only to appData, or elsewhere? All my docs/pictures/music are on HDD but the main Users folder (inc. appData) is on SSD.
-
You mentioned server hardware, but I can tell you the common server hardware from Dell or HP will be slower than what you can build yourself for less. That stuff is built to be reliable, not to be the fastest kid on the block. You say ECC would be a plus but I think you're wrong. Buffered ECC is quite a bit slower and although it sounds good, I can't remember the last time I had to have memory replaced in one of our servers. I remember it happened once and even that time it turned out to be the mainboard. Mind you, I'm working in a datacenter with over 1,000 systems and I've been here since 2004. Some of that hardware is still running. Physical memory failures are rare and if there are issues due to memory, it's because it's incompatible with the mainboard, it's incorrectly configured, incompatible sticks are mixed together or it's running too hot. Solve those, run a 24-hour stress test and you can be fairly certain memory won't be an issue. Manufacturers can give lifetime warranties on their sticks - they don't do that if it's even remotely likely to fail.
While I agree with you generally about ECC for this situation, preventing RAM "failure" is not what ECC is really all about. ECC prevents transient errors from impacting individual transactions (which might not otherwise be detectable), to protect against small, impossible-to-reproduce errors in results. These can come from a variety of sources (http://en.wikipedia.org/wiki/Cosmic_ray), but they can and do occur (and the problem is getting worse as node sizes decrease).
You're right, ECC is slower. However, memory latency has an overall very small effect on the performance of the system, once you reach the "sweet spot" of the architecture in question. Anand has some very nice writeups on this, but it amounts to: DDR-1600 is the sweet spot for Sandy/Ivy-based designs. You do get more performance with going higher (dramatically in some memory bandwidth sensitive applications), but the rate of returns drops off sharply. Since low-latency DDR's prices scale essentially "in the reverse", there is usually not a good ROI on going with high-speed or low-latency (same thing, different branding) RAM.
Still, I'd also agree that ECC probably wouldn't help them much here. It is essential for things like HPC for scientific applications. Here... I don't know, you'd just rebuild again.
-
You say ECC would be a plus but I think you're wrong. Buffered ECC is quite a bit slower and although it sounds good, I can't remember the last time I had to have memory replaced in one of our servers.
I've had bad memory in my main home machine twice over the course of three machines.
It's a small sample size, but I just don't trust memory as a result.
When it goes south, it leaves you with a big mess: "Oh, any file copy I did in the last month might be corrupt."
-
Hmm. Updated my 3770K machine to .124 (it was on .117) and now get JRMark of 5281. No other changes, not even a reboot.
=== Running Benchmarks (please do not interrupt) ===
Running 'Math' benchmark...
Single-threaded integer math... 3.015 seconds
Single-threaded floating point math... 2.001 seconds
Multi-threaded integer math... 0.853 seconds
Multi-threaded mixed math... 0.530 seconds
Score: 2969
Running 'Image' benchmark...
Image creation / destruction... 0.162 seconds
Flood filling... 0.343 seconds
Direct copying... 0.543 seconds
Small renders... 0.851 seconds
Bilinear rendering... 0.655 seconds
Bicubic rendering... 0.620 seconds
Score: 6933
Running 'Database' benchmark...
Create database... 0.297 seconds
Populate database... 1.031 seconds
Save database... 0.268 seconds
Reload database... 0.054 seconds
Search database... 0.747 seconds
Sort database... 0.724 seconds
Group database... 0.497 seconds
Score: 5942
JRMark (version 18.0.124): 5281
No idea if it is the upgrade that changed the result or just the time of day :D
One OT point, my HTPC (3550K @3.4Ghz, 8GB RAM, 128GB Vertex4 SSD) has a JRMark of 4640:
=== Running Benchmarks (please do not interrupt) ===
Running 'Math' benchmark...
Single-threaded integer math... 3.569 seconds
Single-threaded floating point math... 2.373 seconds
Multi-threaded integer math... 1.797 seconds
Multi-threaded mixed math... 1.236 seconds
Score: 2117
Running 'Image' benchmark...
Image creation / destruction... 0.186 seconds
Flood filling... 0.375 seconds
Direct copying... 0.426 seconds
Small renders... 1.008 seconds
Bilinear rendering... 0.651 seconds
Bicubic rendering... 0.593 seconds
Score: 6792
Running 'Database' benchmark...
Create database... 0.369 seconds
Populate database... 1.186 seconds
Save database... 0.267 seconds
Reload database... 0.039 seconds
Search database... 0.894 seconds
Sort database... 0.914 seconds
Group database... 0.622 seconds
Score: 5010
JRMark (version 18.0.124): 4640
In most ways my 3770K shows a big improvement, but the image benchmark not so much! What does the image benchmark test?
-
It is all workload dependent. I've seen plenty of benchmarks where Ivy shows (even dramatic) results of its IPC gains. As I explained above, Sandy (being a tock) is likely to be better for overclocking (it is). But at identical clockspeeds, many applications do show speedups of between 5-15%. The average is certainly not just "around 5%". It just depends on what you do (though many games have tended to be more on the lower-end of the scale).
Workload dependent means it's situational, right? ;) I've seen big dramatic benchmark results too, but how were they made? They are either very situational or made on 2 setups, different mobos, maybe even different chipsets. The benchmark link I gave is particularly interesting if you ask me because it's "just" an enthusiast, like you and me. He swapped the CPU in a system, he didn't compare 2 systems. He has nothing to gain by posting optimistic results like some big hardware sites reviewing the very stuff they advertise elsewhere on their site. If you look at his Pi and Prime benchmarks, the results are less than 5%.
Look at AnandTech's compilation results (http://www.anandtech.com/show/5771/the-intel-ivy-bridge-core-i7-3770k-review/6), where the compile time drops from 18.6 minutes to 17.7. Only just over a 5% improvement. I quote:
Compile Chromium Test
You guys asked for it and finally I have something I feel is a good software build test. Using Visual Studio 2008 I'm compiling Chromium. It's a pretty huge project that takes over forty minutes to compile from the command line on a Core i3 2100. But the results are repeatable and the compile process will stress all 12 threads at 100% for almost the entire time on a 980X so it works for me.
Ivy Bridge shows more traditional gains in our VS2008 benchmark—performance moves forward here by a few percent, but nothing significant. We are seeing a bit of a compressed dynamic range here for this particular compiler workload, it's quite possible that other bottlenecks are beginning to creep in as we get even faster microarchitectures.
Averaged out, I think I'm even on the high side with 5%.
-
Updated my 3770K machine to .124 (it was on .117) and now get JRMark of 5281.
JRMark scores often get better with newer builds.
JRMark is testing raw machine performance and also our code. So when our code or compiler gets faster, the JRMark improves.
If you compare JRMark scores, it's important to use similar versions of the program.
At some point, I think we'll probably rebalance the tests (and maybe add a few things) because some of the performance improvements have changed the weighting quite a bit.
Here's an example:
http://yabb.jriver.com/interact/index.php?topic=70947.5
-
I've had bad memory in my main home machine twice over the course of three machines.
It's a small sample size, but I just don't trust memory as a result.
When it goes south, it leaves you with a big mess: "Oh, any file copy I did in the last month might be corrupt."
Glynor brings up some good points too which are very true. If you're worried about it then by all means go for it, if only for peace of mind. The speed tradeoff is a minor one.
-
There would certainly be no point upgrading to current IB chips. It might be worth waiting until there is an IB 6/8-core chip, whenever that might be. But the 3960/70X seems like it should offer a major boost quite simply.
-
JRMark scores often get better with newer builds.
JRMark is testing raw machine performance and also our code. So when our code or compiler gets faster, the JRMark improves.
If you compare JRMark scores, it's important to use similar versions of the program.
At some point, I think we'll probably rebalance the tests (and maybe add a few things) because some of the performance improvements have changed the weighting quite a bit.
Here's an example:
http://yabb.jriver.com/interact/index.php?topic=70947.5
Thanks Matt, just interested that the image tests don't improve much with the major clock speed leap between the 2 systems (both on .124). Some sub-tests are even slower on the 3770K system. What bits of the system is it testing? When writing files (if it does that), where does it write them?
-
Thanks Matt, just interested that the image tests don't improve much with the major clock speed leap between the 2 systems (both on .124). Some sub-tests are even slower on the 3770K system. What bits of the system is it testing? When writing files (if it does that), where does it write them?
The image benchmark will test raw computing power and memory performance (images are big chunks of memory). Both are important.
The only disk I/O is in the database test, and there's not much of it.
-
However, memory latency has an overall very small effect on the performance of the system, once you reach the "sweet spot" of the architecture in question. Anand has some very nice writeups on this, but it amounts to: DDR-1600 is the sweet spot for Sandy/Ivy-based designs. You do get more performance with going higher (dramatically in some memory bandwidth sensitive applications), but the rate of returns drops off sharply. Since low-latency DDR's prices scale essentially "in the reverse", there is usually not a good ROI on going with high-speed or low-latency (same thing, different branding) RAM.
Yeh, I know and you're right. But remember that's under normal circumstances. The *only* reason I bring that up is because I don't really know what happens when you start compiling on 16 or more cores. I don't think anyone has ever tested what effect low latency memory has in such a case (maybe none :P). I do know that on heavily overclocked systems, lower latencies can have a dramatic effect on CPU-heavy things like compression/decompression, whereas on that same system not overclocked, it barely has an effect. So I was thinking under such loads, with so many threads poking around in memory, lower CAS latencies might speed things up as they do on a heavily overclocked system. But you know, it's all guesswork and I honestly don't know.
-
The image benchmark will test raw computing power and memory performance (images are big chunks of memory). Both are important.
The only disk I/O is in the database test, and there's not much of it.
I can only believe it is memory limited (at least on these systems). Both are running RAM at the same speed.
-
Rebooted and ran again (no other changes) and I get 5490! Image score much improved?
=== Running Benchmarks (please do not interrupt) ===
Running 'Math' benchmark...
Single-threaded integer math... 3.019 seconds
Single-threaded floating point math... 2.000 seconds
Multi-threaded integer math... 0.891 seconds
Multi-threaded mixed math... 0.613 seconds
Score: 2913
Running 'Image' benchmark...
Image creation / destruction... 0.156 seconds
Flood filling... 0.325 seconds
Direct copying... 0.435 seconds
Small renders... 0.852 seconds
Bilinear rendering... 0.574 seconds
Bicubic rendering... 0.527 seconds
Score: 7671
Running 'Database' benchmark...
Create database... 0.316 seconds
Populate database... 1.067 seconds
Save database... 0.265 seconds
Reload database... 0.048 seconds
Search database... 0.748 seconds
Sort database... 0.725 seconds
Group database... 0.484 seconds
Score: 5886
JRMark (version 18.0.124): 5490
Enough now
-
E-peen alert! ;D
-
Thanks for all the advice everyone.
This thread got me looking into the concurrency of compiling a little more. I noticed that Visual Studio can now use multiple cores to compile a single project:
http://msdn.microsoft.com/en-us/library/vstudio/bb385193.aspx
Currently we compile multiple projects at a time, but don't enable multi-processor compiling of a single project. Hopefully turning that on for the bigger projects (or maybe all of them) will win us some additional performance.
What about using a product like Incredibuild? It pushes builds out to idle PCs on the network. I've never had a project big enough to need it, but there is a free trial; it integrates with Visual Studio. Supposedly it allows parallelization without any source file or hardware changes. (I love advertising blurbs)
-
Look at AnandTech's compilation results (http://www.anandtech.com/show/5771/the-intel-ivy-bridge-core-i7-3770k-review/6), where the compile time drops from 18.6 minutes to 17.7. Only just over a 5% improvement.
Averaged out, I think I'm even on the high side with 5%.
Oh, I agree. With many workloads the bump from Sandy to Ivy is minor (more minor than previous ticks in the cycle had been). They focused on power efficiency improvements with the node change this time.
Though other applications can get substantial improvements... Photoshop, for one, sees some pretty nice gains.
To be clear: I was never suggesting that Ivy would be a good upgrade from a Sandy system. Quite the contrary! I was just saying that when you go two "cycles" (two years on their current cadence), you get cumulative improvements. Quite often, you'll see many of the underlying areas that didn't get dramatic improvements in the last cycle, gain additional focus in the next. Therefore, it isn't usually the "worst examples" of bumps that you end up with over two+ cycles, but closer to the median. It averages out higher than you were seeming to indicate, is what I'm saying, when you include multiple tick/tock cycles. Iteration wins the race, in other words.
Waiting for Haswell to fully ship allows a few things:
1. You can see what has been improved in the new architecture, with shipping silicon.
2. If you decide to go with a "desktop" chip (Core i7), you'll probably have better overclocking potential with Haswell than Ivy (which will be similar to, but maybe a bit worse-than, Sandy).
3. More importantly, Ivy Bridge-E is coming in 2013 (http://www.guru3d.com/news_story/8_core_sandybridge_e_and_6_12_core_ivybridge_e_in_2013.html) (probably within a few months of Haswell, just like before), and will be a drop-in upgrade for current X79 LGA2011 systems. Ivy-E is expected to ship in 6-12 core versions. They are also shipping improved versions of the existing Sandy Bridge-E chips in Q2 2013.
In summary: There are new options in the LGA2011 space coming. I'd wait for them, and compare them to the (then available) Haswell chips, and decide which has better bang-for-your-buck.
-
Waiting for Haswell to fully ship allows a few things:
Yeh I totally agree, if you can wait it can definitely be worth it.
Although I tried to give a balanced view, I admit I focused more on what's available and less on what's coming; you laid that out quite nicely.
-
Thanks for all the advice everyone.
This thread got me looking into the concurrency of compiling a little more. I noticed that Visual Studio can now use multiple cores to compile a single project:
http://msdn.microsoft.com/en-us/library/vstudio/bb385193.aspx
Currently we compile multiple projects at a time, but don't enable multi-processor compiling of a single project. Hopefully turning that on for the bigger projects (or maybe all of them) will win us some additional performance.
Yea, that's exactly the kind of feature I was talking about. It can make a HUGE difference! It only helps for release builds though according to the note at the bottom:
"The /Gm Compiler Option
By default, a project build enables the /Gm compiler option (incremental builds) for debug builds, and disables it for release builds. Therefore, the /MP compiler option is automatically disabled in debug builds because it conflicts with the default /Gm compiler option."
-
What about using a product like Incredibuild? It pushes builds out to idle PCs on the network. I've never had a project big enough to need it, but there is a free trial; it integrates with Visual Studio. Supposedly it allows parallelization without any source file or hardware changes. (I love advertising blurbs)
A distributed build system like that isn't as appealing as it used to be because there's a limit to how much can be parallelized and today's CPUs are capable of delivering significant parallelism. Back when we only had single and dual core processors with a single hardware thread, something like Incredibuild was great.
But now we have quad, six, and 8-core CPUs with two hardware threads per core.
Put two Intel 6-core CPUs with hyper-threading together and you have 24 parallel units of execution. That's some serious parallelism, and without the network latency.
-
Matt, do you use include guards or #pragma once in your header files (assuming MC18 is written in C or C++)? Check this out:
http://www.bobarcher.org/software/include/index.html
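For anyone who hasn't seen them, the two styles in question look like this (Widget is just a made-up example header):

// Style 1: a classic include guard. Fully portable, and most compilers
// recognize the pattern and avoid re-reading the file on later #includes.
#ifndef WIDGET_H_INCLUDED
#define WIDGET_H_INCLUDED

class Widget
{
public:
    int Value() const;
};

#endif // WIDGET_H_INCLUDED

versus:

// Style 2: #pragma once. Not part of the C++ standard, but supported by
// MSVC and most other compilers; the compiler remembers the file and
// doesn't re-open it at all.
#pragma once

class Widget
{
public:
    int Value() const;
};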
-
Hi Matt,
I think until Ivy-E ships later this year your best bet is a 3970X or even a 3930K (with a new MB and 1600MHz RAM of course).
Changing over to a Xeon is only worth it IMHO if you plan on running dual CPU and can run highly parallel apps and compiling.
The top end Xeons are more than twice the price for about 75% of the GHz of the 3970X and only 2 more cores, and this can't make up the deficit in GHz.
In all the synthetic and workload tests I've read, the i7 3970X and 3930K both match or beat a single E5 2690 with just a mild overclock in most tests. Not even in highly multithreaded apps will the Xeon beat the 3930K or a moderately OC'd 2600K.
The i7 3930K will easily OC to 4.4GHz and the 8-core Xeons don't OC at all! Here's some benchmarks for you that someone ran over at tomshardware.
Even your current Sandy 2600K with a moderate overclock would be about 10 mins faster than the Xeon 2670 in this test below. (That's 20% FASTER)
You can't beat GHz! :)
CS5.5
Video material - AVCHD 1080P 24 Frame Each Cut to 30 minutes of material
Export Codec - H264 HDTV 1080P 24 Preset Default
4 Effects per Layer - Fast Color Corrector, Brightness & Contrast, Video Limiter, Sharpen
Each Layer Scaled to 50% for 4 frame PinP view.
E5 2670 @ 2.6 GHz 8 CORE
32GB 1600
570GTX 2.5GB
4 1Tb Sata 32 Meg Cache 600 Drives in 2 Raid 0 arrays
CS5.5.2
3 Layer -
4 Layer - 40:41
X79 3.3 @ 3.8 GHz
32GB 1333
580GTX 3GB
4 1Tb Sata 32 Meg Cache 600 Drives in 2 Raid 0 arrays
CS5.5.2
3 Layer - 32:15
4 Layer - 35:19
X79 3.3 @ 4.5 GHz
32GB 1333
580GTX 3GB
4 1Tb Sata 32 Meg Cache 600 Drives in 2 Raid 0 arrays
CS5.5.2
3 Layer - 27:43
4 Layer - 30:02
I7 2600K 4.7 GHz 4 core
16GB Blackline 1600 CL 9
570GTX
4 WD 1Tb Sata 64 Meg Cache 600 Drives in 2 Raid 0 arrays
3 Layer - 30:46
4 Layer - 33:36
Just for reference, here's my 3930K at 4.4GHz with a JRMark of 5368.
That's an auto OC, which most boards do with an i7 on full auto. I'll reboot now and do a bench at 5GHz.
JRMark (version 18.0.120): 5368 @ 4.4Ghz
JRMark (version 18.0.120): 6128 @ 5Ghz
=== Running Benchmarks (please do not interrupt) ===
Running 'Math' benchmark...
Single-threaded integer math... 3.090 seconds
Single-threaded floating point math... 2.044 seconds
Multi-threaded integer math... 0.650 seconds
Multi-threaded mixed math... 0.537 seconds
Score: 3006
Running 'Image' benchmark...
Image creation / destruction... 0.158 seconds
Flood filling... 0.422 seconds
Direct copying... 0.358 seconds
Small renders... 0.908 seconds
Bilinear rendering... 0.609 seconds
Bicubic rendering... 0.548 seconds
Score: 7327
Running 'Database' benchmark...
Create database... 0.331 seconds
Populate database... 1.077 seconds
Save database... 0.217 seconds
Reload database... 0.030 seconds
Search database... 0.811 seconds
Sort database... 0.779 seconds
Group database... 0.481 seconds
Score: 5771
JRMark (version 18.0.120): 5368
.... and at 5Ghz... ;)
=== Running Benchmarks (please do not interrupt) ===
Running 'Math' benchmark...
Single-threaded integer math... 2.727 seconds
Single-threaded floating point math... 1.806 seconds
Multi-threaded integer math... 0.626 seconds
Multi-threaded mixed math... 0.402 seconds
Score: 3417
Running 'Image' benchmark...
Image creation / destruction... 0.137 seconds
Flood filling... 0.376 seconds
Direct copying... 0.299 seconds
Small renders... 0.800 seconds
Bilinear rendering... 0.520 seconds
Bicubic rendering... 0.475 seconds
Score: 8441
Running 'Database' benchmark...
Create database... 0.299 seconds
Populate database... 0.923 seconds
Save database... 0.186 seconds
Reload database... 0.026 seconds
Search database... 0.709 seconds
Sort database... 0.687 seconds
Group database... 0.466 seconds
Score: 6524
JRMark (version 18.0.120): 6128
-
I think until Ivy-E ships later this year your best bet is a 3970X or even a 3930K (with a new MB and 1600MHz RAM of course).
Do you know which mainboard he has? Chances are that an in-place upgrade of the CPU is possible. Maybe he already has 1600MHz RAM :).
Changing over to a Xeon is only worth it IMHO if you plan on running dual CPU and can run highly parallel apps and compiling.
The top end Xeons are more than twice the price for about 75% of the GHz of the 3970X and only 2 more cores, and this can't make up the deficit in GHz.
3.8GHz (E5-2690) -> 4GHz (3970X) = 5%. It's 200MHz times 6 cores = 1.2GHz. The 2 extra cores from the Xeon would basically add 7.6GHz, 6.33 times more than the clockspeed advantage of the 3970X. So if compiling is as heavily threaded as they say it is, it should scale much better with more cores than it does with a bit more clockspeed.
In all the synthetic and workload tests I've read, the i7 3970X and 3930K both match or beat a single E5 2690 with just a mild overclock in most tests. Not even in highly multithreaded apps will the Xeon beat the 3930K or a moderately OC'd 2600K.
http://www.tomshardware.com/reviews/core-i7-3970x-sandy-bridge-e-benchmark,3348-8.html
Nuf said :).
The i7 3930K will easily OC to 4.4GHz and the 8-core Xeons don't OC at all! Here's some benchmarks for you that someone ran over at tomshardware.
Even your current Sandy 2600K with a moderate overclock would be about 10 mins faster than the Xeon 2670 in this test below. (That's 20% FASTER)
An E5-2670 is clocked at 2.6/3.3GHz against 2.9/3.8GHz for the E5-2690. Besides, all those benchmarks are taken on different hardware, different chipsets and different memory, so that's hardly a valid comparison.
You need to OC a 3970X by 25% to beat an E5-2690 on raw processing power. But even then it has less cache on all levels, something I believe can be quite beneficial for large projects.
You can't beat GHz! :)
This used to be true when programs were less multithreaded, often even single threaded and optimized for single core processors. This has been changing since multicore CPUs have become more common and I don't believe it holds up anymore, except for some rare situations.
-
Do you know which mainboard he has? Chances are that an in-place upgrade of the CPU is possible. Maybe he already has 1600MHz RAM :).
3.8GHz (E5-2690) -> 4GHz (3970X) = 5%. It's 200MHz times 6 cores = 1.2GHz. The 2 extra cores from the Xeon would basically add 7.6GHz, 6.33 times more than the clockspeed advantage of the 3970X. So if compiling is as heavily threaded as they say it is, it should scale much better with more cores than it does with a bit more clockspeed.
http://www.tomshardware.com/reviews/core-i7-3970x-sandy-bridge-e-benchmark,3348-8.html
Nuf said :).
An E5-2670 is clocked at 2.6/3.3GHz against 2.9/3.8GHz for the E5-2690. Besides, all those benchmarks are taken on different hardware, different chipsets and different memory, so that's hardly a valid comparison.
You need to OC a 3970X by 25% to beat an E5-2690 on raw processing power. But even then it has less cache on all levels, something I believe can be quite beneficial for large projects.
This used to be true when programs were less multithreaded, often even single threaded and optimized for single core processors. This has been changing since multicore CPUs have become more common and I don't believe it holds up anymore, except for some rare situations.
I still stand by my comments and the tests I provided, which are based on a realistic OC of the i7, and the hardware was as common as possible for the testing. The link you posted for the CPU testing was all at stock speeds and as you can clearly see the Xeon is hardly faster even without overclocking the i7. The 3930K is the best bang for buck by a long shot...
Just look at my JRMarks at 4.4GHz & 5GHz. I think those kinds of increases in performance from OC would leave the Xeon for dead. Going from 12MB cache on the 3930K or 15MB on the 3970X to 20MB cache on the Xeon may make a small improvement but I guarantee you it will not catch the overclocked 12 threads of an i7.
JRMark (version 18.0.120): 5368 @ 4.4Ghz
JRMark (version 18.0.120): 6128 @ 5Ghz
The Beast known as the "Borg Cube"... For good reason. :) It has assimilated so much hardware!
(http://ttu5vg.blu.livefilestore.com/y1pMBIQcaihco0wACNU9_mg-inhqoOgvYtsqh-qj5TtMuY45aZT6qo_3HdndOUP0BL9Gj9_ZGWXPOZa3urujJEI51YYbj_Bry6L/IMG_4135.JPG)
(http://ttu5vg.blu.livefilestore.com/y1p0dtO6dnsIIUZLbx6KpAXwQr4ikmlugskUevEBYG-Xb5sR43P9jKCxanJM-pTPWSDPuLSGW9DhbiBmZwWyQUhRkz0AUW0cNzp/IMG_4115.JPG)
(http://ttu5vg.blu.livefilestore.com/y1pWSoAxQpn9FhMN1iqsuWdi4hGsIceo3RawoFji-C8usPbP32zduXMVMWgOGCvsq0SP_FVKTKuSdDjTcOfynkjxWjDY7t7iU2p/IMG_4112.JPG)
(http://ttu5vg.blu.livefilestore.com/y1p8M4fl8wxBRy1NJnD1el0eCUN_v-mtD23P6HezOKYMK_bPo89TQXsUB-H7d_EJ0baRSIuIXPcct4xxE342OBmevtCEYDObCqF/IMG_4168.JPG)
-
That's a sweet machine you've got there.
The 3930K is the best bang for buck by a long shot...
I never argued against that ;)
But that wasn't the question.
It was:
Is there anything available that would be considerably faster, even a Xeon / server machine?
And
Water cooling or hard core overclocking would be a little too much.
But to be clear, I'm not saying you're wrong, you do bring up valid points for Matt to consider ::).
-
Holy cow, that computer is bigger than my coffee table. Are those NINE 140mm fans in the front AND another layer of nine fans behind it?!? You could power a wind tunnel with that!
And I have to concur with hiltonk's assessment. The biggest gain to be had is in overclocking, and that's even more true for compiling. Cache sizes don't really matter; it's the raw speed and number of cores. I'd go by a simple formula of X Cores * Y GHz = Z GHz and go for whatever gets you the biggest Z for your budget. A six core i7 at 4.5 GHz gives a Z of 27, and you can't beat that with an 8-core Xeon until you go all the way up to an E5-2680 (3.5 GHz Turbo Boost), which is an $1800 CPU! So yea, you might be able to beat an i7 with a Xeon, but the difference will be marginal and at a cost of $1200 more. And unless your task is consuming 100% of the available parallelism of 2 additional cores at all times, whatever performance increase you might get with the Xeons will quickly evaporate and the OC'd i7 will again take the lead.
Surprisingly, a bit counter-intuitively, the 32nm Sandy Bridge chips are more overclockable than the 22nm Ivy Bridge, which has problems with heat dissipation because the chips are so dense with the 3D transistor design.
My vote also goes to the i7-3930K
http://www.newegg.com/Product/Product.aspx?Item=N82E16819116492
Should overclock to well over 4 GHz no problem, without water cooling, and without even a hint of instability. "Hardcore" overclocking would be trying to push 5 GHz on air. It also has a four-channel memory controller (another nice boost to parallelism) so you'd want 4 sticks of RAM to take full advantage.
-
Also gotta have an elite-level motherboard like:
http://www.newegg.com/Product/Product.aspx?Item=N82E16813131808#top
-
I know you mentioned that everything was on SSDs, Matt, but are they RAIDed? You mentioned going to SSDs was a big boost; you might get a similar boost going to SSDs in a striped RAID array.
-
Holy cow, that computer is bigger than my coffee table. Are those NINE 140mm fans in the front AND another layer of nine fans behind it?!? You could power a wind tunnel with that!
Well, close. There are 18 push & pull 120mm fans for the 3 GPU radiators at the front and another 6 at the back for the CPU radiator. And then there are also 2 exhaust fans at the back as well. It's not as bad as it sounds. The fans idle at only 700rpm, which is completely silent, and they rarely go above 1200rpm, which is still very quiet. :) It certainly makes more noise when under full load with all fans at about 1900rpm, but it's still quieter than the factory air coolers.
The system draws 1400W (PSU1) and 900W (PSU2) from the wall at 240V under full load, so it needs serious cooling, with about 400W for each GPU and for the CPU at 5GHz. The fans and pumps draw the rest of the power. :) At idle, with normal web surfing and light app loads, it still sits at 230W (PSU1) / 170W (PSU2), so it's not too bad unless I'm gaming or encoding.
-
Thanks for all the feedback so far everyone.
This week I spent some time writing a build system that uses a custom C++ program instead of using batch files. It gives us more flexibility and the ability to automate some extra steps in the build.
Using the new tool it takes about 8 minutes to clean, build, virus check, sign, package, install for a smoke test, and upload the build. This is using the machine I mentioned in the first post.
The /MP mentioned above shaved some time off, although it's a little touchy. At first I tried turning it on for all projects so there were 64 compilation threads running at once (8 projects x 8 threads each). My computer blue screened for the first time ever :o
Anyway, we're going to try living with the new system for a few weeks and then we'll decide if we should throw a 3930K, RAIDed SSDs, etc. at the problem.
I think optimizing the build program and doing more compiler setting optimization could each shave a minute off the process, although it would be easy to spend a couple days of development time for each of those minutes :P
-
My computer blue screened for the first time ever :o
A blue screen is a hardware or driver problem.
:-)
-
Thanks for all the feedback so far everyone.
This week I spent some time writing a build system that uses a custom C++ program instead of using batch files. It gives us more flexibility and the ability to automate some extra steps in the build.
Using the new tool it takes about 8 minutes to clean, build, virus check, sign, package, install for a smoke test, and upload the build. This is using the machine I mentioned in the first post.
The /MP mentioned above shaved some time off, although it's a little touchy. At first I tried turning it on for all projects so there were 64 compilation threads running at once (8 projects x 8 threads each). My computer blue screened for the first time ever :o
Anyway, we're going to try living with the new system for a few weeks and then we'll decide if we should throw a 3930K, RAIDed SSDs, etc. at the problem.
I think optimizing the build program and doing more compiler setting optimization could each shave a minute off the process, although it would be easy to spend a couple days of development time for each of those minutes :P
8 minutes vs what previously?
Have you tried just compiling one project at a time with the /MP feature? 64 concurrent threads actually seems counter-productive. I think you'll have the best luck sticking to 8 compilation threads (4 cores x 2 HW threads per core). Either by not using the /MP feature and doing the 8 projects at a time, or by using /MP with 8 threads but only one project at a time.
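If the build driver ends up managing the throttling itself, here's a rough sketch of the kind of thing I mean; the command strings are placeholders, and a real driver would invoke the compiler or msbuild there:

#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <thread>
#include <vector>

// Runs the given build commands, at most maxParallel at a time, and returns
// nonzero if any of them exits with a nonzero code.
static int RunProjects(const std::vector<std::string>& commands, unsigned maxParallel)
{
    std::atomic<std::size_t> next(0);  // index of the next command to claim
    std::atomic<bool> failed(false);   // set if any command fails

    auto worker = [&]() {
        for (;;)
        {
            const std::size_t index = next.fetch_add(1);
            if (index >= commands.size())
                return;
            if (std::system(commands[index].c_str()) != 0)
                failed = true;
        }
    };

    std::vector<std::thread> pool;
    for (unsigned i = 0; i < maxParallel; ++i)
        pool.emplace_back(worker);
    for (auto& thread : pool)
        thread.join();

    return failed ? 1 : 0;
}

int main()
{
    // Placeholder commands standing in for the ~110 project builds.
    const std::vector<std::string> commands = {
        "build_project_a.bat",
        "build_project_b.bat",
        "build_project_c.bat"
    };

    // With /MP using 8 compiler threads per project, one project at a time
    // already fills a 4-core / 8-thread machine; without /MP, 8 projects at
    // a time does the same.
    return RunProjects(commands, 1);
}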
-
Looks like 7-13% from Haswell:
http://www.tomshardware.com/reviews/core-i7-4770k-haswell-performance,3461.html
Overclocking is the big question mark. It requires no effort to get around 4.3 GHz with a 3770k, and I wonder if a 4770k will best that?
-
Looks like 7-13% from Haswell:
http://www.tomshardware.com/reviews/core-i7-4770k-haswell-performance,3461.html
Overclocking is the big question mark. It requires no effort to get around 4.3 GHz with a 3770k, and I wonder if a 4770k will best that?
On the recent Tech Report podcast they discussed some of the early Haswell benchmarks (including these). The overall bump is as-expected, but Scott Wasson also reported on overclocking a bit, and said that early insider reports are that it is closer to Sandy-like overclockability than Ivy-like overclockability.
Which "fits" based on the focus on efficiency, and a new architecture on a mature process node. If so, that is Very Good News. Ivy isn't a particularly good overclocker.
-
On the recent Tech Report podcast they discussed some of the early Haswell benchmarks (including these). The overall bump is as-expected, but Scott Wasson also reported on overclocking a bit, and said that early insider reports are that it is closer to Sandy-like overclockability than Ivy-like overclockability.
Which "fits" based on the focus on efficiency, and a new architecture on a mature process node. If so, that is Very Good News. Ivy isn't a particularly good overclocker.
If true, I'll be in line at Microcenter on launch day to get my Haswell. Of course, I'll have to fight this guy for the first one:
(http://4.bp.blogspot.com/_WHI2rSW1LFo/S7q3W244NbI/AAAAAAAADn4/b0blKN0_2yU/s400/The_Simpsons-Jeff_Albertson.png)
-
At what price point are we playing this game?
In short: is there anything to make one enthusiastic about Haswell vs launch price + new motherboard?
I don't quite see it. Will it overclock to 5GHz on air (and stay there, not just for benchmarks) with the flip of a switch? An i7-3770k now is $329 (I know, I know, Microcenter has them at $229 but I don't think that's the rule). The 4770k will launch at what? The same? 7-13% more $$$ (to be consistent :))? Or $399? How much will a spanking new motherboard cost for this 'jewel' that comes without GT3 and other stuff? $150-200?
So $500-600 the whole thing. I'm not sure it's an easy sell for a new PC, but for an upgrade?!?
-
So $500-600 the whole thing. I'm not sure it's an easy sell for a new PC, but for an upgrade?!?
Have you been talking to my wife?
-
Have you been talking to my wife?
You need multiple systems. I'll probably upgrade my HTPC and then trickle down the Ivy i5, MB and RAM to a new dedicated router/firewall. Or maybe upgrade the parents' computer with my old one. That keeps parts active longer doing less stressful tasks, and it's therefore easier to justify upgrading, as multiple systems get the benefit of the same $$$.
-
How long does it take to compile now?
-
Latest rumor - Core i7-4770K - $368. Which is... ~12% more than a Core i7-3770K. Hehe, the math aligns :).
-
Coming in late to the party but...
On 'server class' machines, IBM's in-house x86 chipsets outperform Intel's by a fair margin. Mostly in memory bandwidth, but also on the PCI and SATA busses.
I have no idea what a current-generation IBM xSeries machine with an IBM chipset costs these days.
-
Looks like Haswell will be a good overclocker. Or, at least, they'll give you the tools to get there if your chip can do it. We got an important fiddly widget back (http://www.anandtech.com/show/6898/intel-details-haswell-overclocked-at-idf-beijing):
The default BCLK for Haswell parts will remain at 100MHz, however now you'll have the ability to select 125MHz or 167MHz as well. The higher BCLK points are selectable because they come with different dividers to keep PCIe and DMI frequencies in spec. At each of these BCLK settings (100/125/167MHz), the typical inflexibility from previous architectures remains. Intel's guidance is you'll only be able to adjust up/down by 5 - 7%.
Obviously we'll still have K-series SKUs with fully unlocked multipliers. Intel claims the CPU cores will have ratios of up to 80 (8GHz max without BCLK overclocking, although you'll need exotic cooling to get there). Some parts will also have unlocked GPU ratios, with a maximum of 60 (GPU clock = BCLK/2 * ratio, so 3GHz max GPU clock).
Memory overclocking is going to be very big with Haswell. Intel will offer support for 200MHz steps up to 2.6GHz and 266MHz steps up to 2.66GHz on memory frequency, with a maximum of 2.93GHz memory data rate supported.
Getting baseclock flexibility back in addition to unlocked multipliers is huge (we haven't really had that since the Pentium II). Those aren't the world's most flexible dividers for the PCIe and DMI freqs, but it should still help a lot since you can combine them with multiplier fiddling. Overclocking Haswell well will likely be more challenging than the past couple generations though. Combining baseclock and multiplier adjustments (while watching the impact on other related buses) isn't dead-simple like turning a multiplier up or down and testing, but it does give you more headroom if you know what you're doing.
-
Looks like Haswell will be a good overclocker.
That big GPU still makes me nervous.
The thing could leak like a sieve when pushed (or not). There are rumors that LN2 overclocking is awesome, but that doesn't mean much for overclocking on air. If the chips are thermally limited, you can "fix" it by cooling the bejeezus out of them. AMD's Phenoms and Phenom IIs were like this (as were the IBM PowerPC CPUs Apple used in the G5). They could throw up unbelievable LN2 overclocks, but were very limited on air or with average consumer-grade liquid cooling rigs.
We'll see. Should be an interesting time. If they're crappy overclockers, my bet is on the GPU, and then I'd wait for Haswell E and see what it brings.
-
The Haswell embargo lifted this morning:
http://www.hardocp.com/article/2013/06/01/intel_haswell_i74770k_ipc_overclocking_review
http://anandtech.com/show/7003/the-haswell-review-intel-core-i74770k-i54560k-tested
etc.
-
Well that hardly seems worth the cost of upgrading, unless you need better integrated graphics, or were already at the absolute upper limits of performance and need something faster. Ivy Bridge seemed the same way.
Is Intel focusing too much on reducing power consumption? (which also doesn't seem like a significant reduction?)
Maybe they need to start focusing on 6-core or 8-core consumer CPUs like AMD.
-
Is Intel focusing too much on reducing power consumption? (which also doesn't seem like a significant reduction?)
I was expecting more impressive idle power numbers. I'm typing this on a 2500k computer in my living room that idles at just over 30 watts. It doesn't look like the 4770k is really much (if any) better.
And since it's not much faster, and doesn't overclock much better, the release is a bit more of a yawn than I was hoping for.
-
I was expecting more impressive idle power numbers. I'm typing this on a 2500k computer in my living room that idles at just over 30 watts. It doesn't look like the 4770k is really much (if any) better.
And since it's not much faster, and doesn't overclock much better, the release is a bit more of a yawn than I was hoping for.
That seems very low for a 2500K system. Mine is more than double that! It's overclocked, I do have eight drives inside it, and a GTX 570 though…
Anandtech's review indicates that there should be a 15–20W reduction (http://images.anandtech.com/graphs/graph7003/55329.png), but if you are only measuring 30W, that seems unlikely.
One of the reasons I have been looking forward to Haswell has been new NUCs or Mac Minis though. Since my system won't sleep (hardware conflict) my system has been on a lot more now that I am using Media Center.
So I really like the idea of having a system with less than 10W power consumption for when I'm just doing desktop related tasks or playing music.
But you then have the hassle of files being spread over multiple computers, and not being able to access everything while only one system is on - I would probably end up in a position where I'm running both at the same time.
I'd rather have a chip that can reach both extremes - disable cores entirely and bring idle/light usage consumption down to NUC levels, but allow it to perform like a high-end desktop part when necessary.
Having more cores seems like the best solution for this.
That said, I don't know if trying to reduce power consumption is just a losing battle. I just bought a new DAC, and because there's a horrible pop through the speakers/headphone outputs when turning it on, I've been advised to just leave the unit in standby - which is how it actually seems to have been designed to operate.
Being an American-made product, it draws 12W in standby though! (up to 15W in operation, and 0.5W when "off")
-
That seems very low for a 2500K system. Mine is more than double that! It's overclocked, I do have eight drives inside it, and a GTX 570 though…
This is just a casual living room computer with no video card, a single SSD, no add-on cards, no overclocking, etc. I removed the fan from the power supply (I re-purposed a nice Seasonic that was overpowered for the build), so the only moving part is the 120mm CPU fan.
The LED monitor only takes a bit over 20 watts, so the whole thing is less than a 60 watt light bulb when on, and it sleeps down to a watt or two after a few minutes of inactivity. It's pretty neat for what it is.
(ps. To make sure I wasn't remembering wrong, I just measured again. It's about 29 watts for the machine and about 51 watts with the monitor on too. It uses a little more if you load the machine, so this is just staring at the desktop.)
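(And for the curious, here's what that idle draw works out to over a year - the electricity rate below is just an assumed $0.12/kWh, plug in your own:)

idle_watts = 29
rate_per_kwh = 0.12                      # assumed rate; substitute your local one
kwh_per_year = idle_watts * 24 * 365 / 1000
print(f"{kwh_per_year:.0f} kWh/yr, about ${kwh_per_year * rate_per_kwh:.0f}/yr if it never slept")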
-
Very nice - though it sounds like you might also want to look to switching over to something like a NUC if it's mostly running idle/playing music, and you're wanting to reduce your power consumption.
-
Desktop Haswell doesn't have all of the power optimizations they built into the Mobile versions (in particular, the new active idle power states).
Also, you should note... You cannot directly compare the TDP numbers between Haswell and pre-Haswell. That's because they substantially revised the power delivery system in Haswell, including moving the voltage regulator onto the die (Intel calls this the FIVR, Fully Integrated Voltage Regulator). This has a whole bunch of positive impacts (more independent voltage rails, dramatically quicker voltage ramps which allow the chips to much more finely tune power draw, etc), but it does also lead to a higher overall TDP number for the CPUs themselves as they took a component that was a separate chip on the motherboard (with a fairly large power dissipation requirement) and moved it to the CPU die itself.
In any case, desktop Haswell has around 25% better idle power usage than Ivy, and 11-12% worse usage at load (clock-for-clock), though the results depend almost entirely on how you test (http://techreport.com/review/24879/intel-core-i7-4770k-and-4950hq-haswell-processors-reviewed/7). Improvements on the mobile chips, particularly when you consider total platform power (especially on the chips that have the new GT3 GPU core and will no longer require discrete GPUs with their extremely power-hungry wide GDDR memory buses), will be much more dramatic. But, in the past, the "big money" improvements were all from ramping clockspeed, which isn't going to happen anymore. Also, consider that the release cycle is extremely compressed compared to the old days (Ivy was only a year ago). So, compared to the old days, you'd have to go back and look at improvements over something like Lynnfield. That doesn't look bad at all (http://techreport.com/review/24879/intel-core-i7-4770k-and-4950hq-haswell-processors-reviewed/14).
As far as performance...
It is about what I expected: a similar bump over Ivy as Ivy was over Sandy (10-20% or so, maybe averaging 13% across workloads clock-for-clock). This isn't shocking at all, really. Desktop isn't "important" anymore, and this architecture was absolutely focused on mobile. The huge gains are all in the GT3 GPU (which isn't even offered or available for desktop versions of Haswell), particularly those versions of the GT3 with the Crystalwell eDRAM L4 cache. It is a huge bummer that they didn't offer a desktop version with Crystalwell, because it is a general-purpose, super-fast L4 cache (it caches CPU functions in addition to GPU functions).
I did notice this (http://www.anandtech.com/show/7003/the-haswell-review-intel-core-i74770k-i54560k-tested/6), though... Which applied directly to your original request in this thread:
Quite possibly the most surprising was just how consistent (and large) the performance improvements were in our Visual Studio 2012 compile test. With a 15% increase in performance vs. Ivy Bridge at the same frequencies, what we’re looking at here is the perfect example of Haswell’s IPC increases manifesting in a real-world benchmark.
A 15% IPC gain (clock-for-clock) is no small feat for something as "old" as a compile test. That's real money there. It is almost certainly the extra two execution units (http://www.realworldtech.com/haswell-cpu/4/) (though the front-end improvements likely helped here too).
But, if you want to see the real change in Haswell, you need to look at this article over here on GT3+Crystalwell (http://www.anandtech.com/show/6993/intel-iris-pro-5200-graphics-review-core-i74950hq-tested). That's where they spent their money and time. Along with the stuff we don't know yet because they haven't launched the general mobile parts yet (only the 4950HQ, which was almost certainly launched because we're getting new Macbook Pros at WWDC this week).
Overclocking...
Hmmm.... Well, the 4770K is fully unlocked (multiplier and bus) which is pretty cool. The on-package VRMs are worrying though, along with that higher TDP. We'll have to wait and see. Historically, the process refinement part of Intel's cycle is the "good ones" for overclocking, but putting all of that power stuff on the die... Not sure. Might take more than you can throw at it to cool it on air.
-
Yep, looks like those iVRMs are trouble in overclockland:
http://techreport.com/review/24889/haswell-overclocked-the-core-i7-4770k-at-4-7ghz
-
At this point I don't see anything good about Haswell. As a desktop chip it's not worth it - cool tech as an abstraction on paper, but a very expensive proposition in real life. As a mobile chip - it's not here, it has variations that would make even a math magician's head hurt trying to keep track of them, and... did I mention it's not here? What makes them think people will wait 6 more months to start getting Haswell NUCs or - whoa! - a Surface-like device with Haswell inside? And God forbid that thing doesn't stay charged a full day and doesn't have better graphics than whatever is the latest iPad.
After all, we've been talking about this for the last 6 months. Let's wait 6 more. And if not 2013, then Christmas 2014. I'm Daydream MacLeod, of the clan MacLeod... I can wait forever.
Next!
-
A 15% IPC gain (clock-for-clock) is no small feat for something as "old" as a compile test. That's real money there. It is almost certainly the extra two execution units (http://www.realworldtech.com/haswell-cpu/4/) (though the front-end improvements likely helped here too).
Still not seeing anything making it worth the cost of upgrading from my 2500K. I have a friend with a 3570K running at stock speeds and it is slower than my system.
I'm barely even pushing the chip either - no PLL overvolting, I have a negative voltage offset based on what the motherboard defaults to at 4.5GHz, and it is around 40℃ on air with an old Thermalright True 120 and an updated mounting kit.
It may as well have been 4.5GHz stock.
Yep, looks like those iVRMs are trouble in overclockland:
http://techreport.com/review/24889/haswell-overclocked-the-core-i7-4770k-at-4-7ghz
As expected—the integrated VRMs were another power-saving feature.
Seems stupid to have them on the desktop and restrict performance, when the desktop chips don’t include all the power-saving features of the mobile chips.
I wonder—does Haswell allow the chips to downclock more? One of the biggest things wasting power on my 2500K is that it will only ever downclock by 50%, and keeps all four cores active.
All this talk of improved mobile graphics seems disingenuous as well—the parts which offer a 2× performance boost over the previous generation also have close to double the TDP.
I know that it's supposed to replace the need for a dedicated GPU now, but I'm just not seeing the performance for that.
The one thing I was interested in with the upgrade, aside from performance, was being able to use QuickSync for transcoding—but I don’t know that Media Center supports it anyway, and reports suggest that they have started reducing quality for speed.
Oh, and SSD caching would be nice, but there are so many limitations with that, and Z68 motherboards are cheap now.
-
reports suggest that they have started reducing quality for speed.
That looks like a bug to me. Hopefully a driver bug.
I'm not a fan of the black-box of QuickSync anyway.
Seems stupid to have them on the desktop and restrict performance, when the desktop chips don’t include all the power-saving features of the mobile chips.
It provides some power benefits to the desktop too, and "costs" them only overclocking performance (which they don't care about very much). So, they threw the extreme guys a bone with the new frequency straps (which should let people with crazy-pants LN2 coolers push these to 8GHz or more) and more importantly...
They didn't have to develop/maintain two separate architectures for the handful of crazy enthusiasts running desktop systems way out of spec. These are quite-clearly consumer focused, and in the consumer market, it makes sense (when they better control the VRMs it also probably has a big impact on reliability and longevity because you aren't relying on "who knows what" VRMs the motherboard makers decided to use).
If you care about ultra-high-end desktop performance, they clearly want you on the E-series chips. In other words, pay us. But, frankly, the old Pentium EE chips used to always launch at $1k... The highest-cost Haswell isn't even close in price to that. They've just segmented the market differently now.
And, that, makes a whole lot of sense when you consider that everyone is buying laptops now. The desktop market is shrinking, and the lion's share of what's left is all corporate junk where reliability trumps speed every time.
-
As a mobile chip - it's not here, it has variations that would make even a math magician's head hurt trying to keep track and... did I mention is not here?
Oh, it's here.
Someone (https://developer.apple.com/wwdc/) will just be buying them all for a while (and hardly anyone else can afford to sell a bunch of high-end-only versions -- HP and Lenovo all want the low-end variants, which always launch delayed). Tune in next Monday.
-
Still not seeing anything making it worth the cost of upgrading from my 2500K.
I wouldn't upgrade from a Sandy either, but I wouldn't have done that with a two year old system anyway. If you're looking for something better than a desktop Sandy, then wait for Ivy-E in September, or the next Tick in summer 2014-ish.
It is a bummer they don't seem to overclock better (or, at least, it depends heavily on luck of the draw). Reminds me of the Pentium EE days, except of course those IPC gains were more like 2-5%. 20% IPC gains over Sandy in less than two years isn't too shabby at all. IPC gains are "hard", and never go backwards. So, any overclock you do get is improved by that percentage (so a Haswell at 4.5 is roughly == Sandy at 4.5*1.2). I tried to explain that earlier... Their horizon for IPC improvements is based on multiple cycles "working together". They fixed the front end in Sandy/Ivy, and then added extra execution ports in Haswell. Neither of these changes is massive across-the-board by themselves, but they fit together to make huge changes over time. Again, go back and compare them to Lynnfield or Nehalem, and remember that overall clock speeds have been pretty much static for much of that time.
If you're looking for a laptop or a desktop to run at stock, though... Haswell is a pretty big deal.
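Rough math on that compounding, if it helps (the per-generation percentages are the ballpark figures from this thread, not official numbers):

ipc_vs_sandy = {"Sandy": 1.00, "Ivy": 1.07, "Haswell": 1.20}   # ballpark, per this thread

def sandy_equivalent_ghz(arch, clock_ghz):
    # scale clock by relative IPC so chips from different generations compare
    return clock_ghz * ipc_vs_sandy[arch]

print(f"{sandy_equivalent_ghz('Haswell', 4.5):.2f}")   # ~5.40 "Sandy GHz"
print(f"{sandy_equivalent_ghz('Sandy', 4.7):.2f}")     # 4.70 - a decent Sandy OC still loses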
-
That looks like a bug to me. Hopefully a driver bug.
Hopefully.
I'm not a fan of the black-box of QuickSync anyway.
It would be nice to be able to transcode my Blu-rays in real-time for streaming to devices around the house that don't seem to support their native format via DLNA - for example, all the Sony TVs we have here.
They support up to 1080p H.264 and MPEG2 via USB, but I can't get Media Center to stream that via DLNA without having to transcode. Transcoding to MPEG2 looks awful, and H.264 eats up all of my CPU power so it doesn't stream smoothly, and I can't use the PC for anything else.
Being able to use QuickSync would look good enough for streaming to those TVs, and leave my CPU free so that I can use the PC at the same time as someone else streaming a film.
It provides some power benefits to the desktop too, and "costs" them only overclocking performance (which they don't care about very much). So, they threw the extreme guys a bone with the new frequency straps (which should let people with crazy-pants LN2 coolers push these to 8GHz or more) and more importantly...
I guess. Power saving seems minimal at best, from real-world testing so far.
They didn't have to develop/maintain two separate architectures for the handful of crazy enthusiasts running desktop systems way out of spec. These are quite-clearly consumer focused, and in the consumer market, it makes sense (when they better control the VRMs it also probably has a big impact on reliability and longevity because you aren't relying on "who knows what" VRMs the motherboard makers decided to use).
That's true I suppose, though most motherboards went far beyond what Intel specified for VRMs.
If you care about ultra-high-end desktop performance, they clearly want you on the E-series chips. In other words, pay us. But, frankly, the old Pentium EE chips used to always launch at $1k... The highest-cost Haswell isn't even close in price to that. They've just segmented the market differently now.
Yeah - I don't care about bleeding-edge desktop performance like that, I'm just looking for something comparable to my 2500K, which is "supposed" to be a 3.3GHz chip, but runs at 4.5GHz with ease. Anything above 4.5GHz or so is when you actually start to "push" the chip, and need to supply higher voltages and start using loud cooling solutions.
Similarly, the E5200 which I upgraded from was a 2.5GHz chip that could run at 3.5GHz using the stock Intel cooler.
And, that, makes a whole lot of sense when you consider that everyone is buying laptops now. The desktop market is shrinking, and the lion's share of what's left is all corporate junk where reliability trumps speed every time.
The funny thing is that I got out of buying laptops specifically because performance gains were so small, and power consumption wasn't improving in any meaningful way. It turned out that an iPad is actually enough to handle most of the "mobile" use that I got out of a laptop, and having a desktop PC is so much better for when I'm actually doing anything that requires CPU power.
That said, I wouldn't mind moving to something like an 11" MacBook Air or that Sony tablet Jim picked up for the flexibility x86 offers.
But it would need to have a real 10-hour battery life for that to happen, which seems years away. Currently they're advertising ~6 hours, but realistically you only get about 2 on the 11" Airs. (because doing anything is "high CPU usage" on those chips)
-
I agree QuickSync could be cool. But so far... I'm not sure.
My view is certainly "tainted" by my experience doing "real" transcoding. To me, it would make much more sense to get a dual Xeon CPU 16-core behemoth for those kinds of purposes and do software transcode. But, of course, money. But, like I said... From my point of view, you can now do on a single workstation what used to (only a few years ago) take a cluster of high-end machines running in parallel.
I guess. Power saving seems minimal at best, from real-world testing so far.
Power reliability was what I meant. But Haswell is a bit more power efficient too. Anand's article doesn't show it right. You have to look at task energy. If Idle power is much lower (and most PCs spend most of their time idling), but you have higher IPC (even at higher usage at load), you can "race" to the lower power state and still win overall in the amount of power used to complete a specific task (http://techreport.com/review/24879/intel-core-i7-4770k-and-4950hq-haswell-processors-reviewed/7).
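To make the "race to idle" point concrete, here's toy math with completely made-up wattages and timings (not measurements from any review):

def task_energy_wh(load_w, idle_w, task_s, window_s=60.0):
    # energy (Wh) over a fixed window: busy at load_w for task_s, idle for the rest
    return (load_w * task_s + idle_w * (window_s - task_s)) / 3600.0

# hypothetical figures, not measurements:
old_chip = task_energy_wh(load_w=77, idle_w=30, task_s=10.0)
new_chip = task_energy_wh(load_w=84, idle_w=22, task_s=8.7)   # hungrier at load, but faster and idles lower
print(f"old: {old_chip*1000:.1f} mWh   new: {new_chip*1000:.1f} mWh")
# the chip that finishes sooner and idles lower wins on energy-per-task,
# even though its load power is higher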
Now, it'd have been WAY BETTER if they'd included the new sleep states in the desktop variants, which is mostly what the new on-die VRMs were for. The idea of the new sleep states is that the CPU can do "micro-sleep" all the time (shutting down even the VRMs themselves). So, the idea isn't that it does some task, but then sleeps when you stop using it. The idea is that it sleeps for 2 seconds, or 50ms, here and there while you are using it. In between keypresses and the like.
That, plus the DRAM-backed display tech Intel is pushing could make a huge difference on laptops.
Also, the Airs I've played with got 4-6 hours real-world easily for normal usage. Not gaming, of course, and running Garage Band kills them (which is probably all GPU and memory), but for "regular stuff" that you'd do on an Air (web browsing, Office, etc) they were pretty good. For my money, though, I'd much rather have a 13" Retina Macbook Pro. But not with that crappy Ivy GPU.
But a 13" Retina Macbook Pro with a Haswell GT3+Crystalwell? That looks like it could be pretty darn interesting.
-
I agree with you on the iPad, though... That's why I wouldn't ever be very interested in an Air, and prefer to have a "real laptop" (mine's a Sandy 15" Macbook Pro) and an iPad. Execs and journalists who fly and type a lot like them though. My COO is a huge fan of his, but he spends his whole day in Office and email.
-
Power reliability was what I meant. But Haswell is a bit more power efficient too.
That all said... I agree, the power saving on the desktop is decidedly ho-hum. It isn't nothing, even if Anand's article makes it out to be (and they keep beating the crap out of AMD, little by little - how times have changed since Netburst), but... Meh.
I think this summary from Scott Wasson says it well:
On the desktop, the generational progress from Ivy Bridge to Haswell is fairly modest, as we've noted throughout our analysis. This chip doesn't even move the needle much on power efficiency in its socketed form. For those folks who already own a Sandy or Ivy Bridge-based system, there's probably not much reason to upgrade—unless, of course, your present system has become a serious time constraint. We did shave off 34 seconds when compiling Qt on the 4770K, after all, and we've illustrated that much larger speed gains are possible in floating-point intensive applications that make use of Haswell's FMA capability.
<snip>
With that said, Haswell's integrated graphics have made bigger strides than the CPU cores this time around. The HD 4600 IGP in the Core i7-4770K isn't quite as fast as the one in AMD's A10-5800K, but it comes perilously close to wiping out AMD's one consistent advantage in this class of chip. And the Iris Pro graphics solution in the Core i7-4950HQ not only wipes out that advantage but threatens low-end discrete mobile GPUs, as well.
Haswell's true mission is to squeeze into thinner and lighter laptops and tablets, where it can provide something close to a desktop-class user experience with all-day battery life. Much of the new technology developed to make that happen isn't present in the desktop versions of Haswell. That's fine, as far as it goes. Focusing on mobile applications surely makes good business sense at this point. We'll take what gains we can get on the desktop, where the user experience is already very satisfying, and we are very much looking forward to getting our hands on some Haswell-based mobile systems to see how much more of its promise this architecture can fulfill.
-
And, that, makes a whole lot of sense when you consider that everyone is buying laptops now. The desktop market is shrinking, and the lion's share of what's left is all corporate junk where reliability trumps speed every time.
I'm not great at predicting trends due to my personal weirdo-quotient (ie. I often like weird things). But isn't this sort of a self-fulfilling prophecy?
Effectively Intel is saying "People aren't excited about desktops so we're not going to release exciting products." Microsoft is saying "People aren't excited about Windows on the desktop so we're going to release a lousy desktop experience and stop releasing software that makes people want a desktop." It's a chicken and egg issue.
-
I agree QuickSync could be cool. But so far... I'm not sure.
My view is certainly "tainted" by my experience doing "real" transcoding. To me, it would make much more sense to get a dual Xeon CPU 16-core behemoth for those kinds of purposes and do software transcode. But, of course, money. But, like I said... From my point of view, you can now do on a single workstation what used to (only a few years ago) take a cluster of high-end machines running in parallel.
Well for me, the only time I would personally want transcoding is to view files on my iPad, which are not feature-length films, and on that display it's more about convenience than quality.
But JRemote doesn't currently support transcoding for video anyway.
QuickSync is probably better quality than anything my CPU can handle in real-time, and without the CPU load and power consumption associated with it.
Streaming to the other TVs is not something I would make use of, because I only watch films on the main TV, which is hooked up to the PC via HDMI.
But it would be nice to have good enough quality, that doesn't have a performance impact for anyone else here that might want to use it on one of the other TVs.
Now, it'd have been WAY BETTER if they'd included the new sleep states in the desktop variants, which is mostly what the new on-die VRMs were for. The idea of the new sleep states is that the CPU can do "micro-sleep" all the time (shutting down even the VRMs themselves). So, the idea isn't that it does some task, but then sleeps when you stop using it. The idea is that it sleeps for 2 seconds, or 50ms, here and there while you are using it. In between keypresses and the like.
That's actually one of the things that had me excited about Haswell after reading this article some time ago (http://www.anandtech.com/show/6355/intels-haswell-architecture/3) - I didn't realise they only planned on it being available on the mobile chips.
Apple has actually been advertising similar things for years now - such as sleeping in-between every keystroke: http://www.apple.com/macbook-pro/environment/ (http://www.apple.com/macbook-pro/environment/)
I have to say though, while it may add up to save power (which is often why you get longer battery life in OS X than in Windows), I can't stand to hear the CPU switching power states all the time in their notebooks. Those high-pitched whines drive me nuts.
That, plus the DRAM-backed display tech Intel is pushing could make a huge difference on laptops.
Yep.
Also, the Airs I've played with got 4-6 hours real-world easily for normal usage. Not gaming, of course, and running Garage Band kills them (which is probably all GPU and memory), but for "regular stuff" that you'd do on an Air (web browsing, Office, etc) they were pretty good.
Well I suppose it depends what your normal usage is. Most people I know with Airs are complaining that they only get 2-3 hours before the battery dies.
For my money, though, I'd much rather have a 13" Retina Macbook Pro. But not with that crappy Ivy GPU.
When most of my work is done on a desktop machine, and I'm used to the portability of an iPad, I'm not sure that I want something as big as that now. I don't know that I'd buy another MacBook Pro again anyway, because they're so expensive for the performance that you get from them. I'm always wanting to upgrade long before I've had my money's worth from them.
But a 13" Retina Macbook Pro with a Haswell GT3+Crystalwell? That looks like it could be pretty darn interesting.
Perhaps. It's a shame that while we went "Retina" on the iPhones, iPods, and iPads without a price penalty, going "retina" on the MacBooks is a big price increase.
I'm not great at predicting trends due to my personal weirdo-quotient (ie. I often like weird things). But isn't this sort of a self-fulfilling prophecy?
I don't think so. I don't know anyone that actually wants to own a desktop computer these days - at most they will consider an iMac if they need "a lot of power" and that's essentially a laptop with a big screen attached.
These days it's mostly gamers that are left buying desktop PCs - and they're moving towards smaller form factor systems that are no bigger than a full length video card.
I wouldn't mind one of them if it weren't for the noise and lack of storage options. With everything shifting towards smaller form factors and lower power consumption, I'm starting to regret buying a large tower (http://www.silverstonetek.com/product.php?pid=242) though.
Most people - if they want a computer at all now, and aren't satisfied with an iPad or even just an iPhone - want a laptop that they can use at a desk/table if necessary, but are mostly just using on their lap when sitting on the sofa, lying in bed, taking it with them to a café etc.
In fact, people that are only a few years younger than I am are starting to use their laptops as their sole entertainment devices. I know a worrying number of people that are happy to carry their laptop around the house with them and use it as their music system, streaming via Spotify or similar services. All video content is just streamed to it via Netflix. High fidelity is a completely foreign concept to most people under, say, 25. The most you are likely to find is people that are into headphones for fashion, sound isolation, or bass, rather than actually caring about fidelity. (hence the popularity of Beats)
Now that's potentially good news for you, as it means more people are shifting towards computer-based audio playback, but most people would rather pay for a streaming service that costs roughly the price of a single album a month than actually own music and have a "library" to manage.
And in some ways, I don't blame them. I personally hate having a huge library of physical discs that I have to store somewhere, and are likely to be surpassed in quality in a few years.
I feel sorry for people that had big VHS libraries, then had collections of hundreds if not thousands of DVDs, and now we have Blu-rays which are a significant improvement. And in a few years time we will likely have 4K and eventually 8K too.
I'm happy enough to purchase Blu-rays though, because the quality is generally very good, and while it may not stand up to native 4K/8K video, I think it should remain watchable for a long time.
Even a good DVD never really impressed me when they were the current thing - they're all full of MPEG2 artefacts, sharpening, noise reduction etc.
If you're paying for a streaming service, you got a free upgrade from SD to 720p, to 1080p, and beyond.
Now the baseline quality is not good enough for me yet, but I'm sure it will be eventually, and why not pay the equivalent of buying a single disc to access any film you want, instantly, even if that means you don't own it?
-
I'm not great at predicting trends due to my personal weirdo-quotient (ie. I often like weird things). But isn't this sort of a self-fulfilling prophecy?
Effectively Intel is saying "People aren't excited about desktops so we're not going to release exciting products." Microsoft is saying "People aren't excited about Windows on the desktop so we're going to release a lousy desktop experience and stop releasing software that makes people want a desktop." It's a chicken and egg issue.
Maybe somewhat (and I agree with you on Windows 8, largely). But...
The move to Laptops/Mobile has been a LONG time coming. The problem is that for the past few years, corporate desktop sales have been propping up the plummeting consumer desktop numbers. But corporations are switching now too. In my company, we're deploying way more laptops than desktops now, whereas just two or three years ago laptops used to be only deployed to "important people" (managers) and worker-bees who "needed them". The race-to-the-bottom in the laptop space (started by netbooks) made it economically viable to just use laptops everywhere now, even in corporate-land.
Mobile x86 CPUs got "good enough" for most office needs. Laptops got "cheap enough" that... Why not? Then the workers can be mobile, even if they don't need to be very often. And they're easier to tote around without an IT person if you need to move a worker from one desk to another. So, the bottom is falling out. I don't know that it is ever going to go back. For a while, the decline in desktops was in the "Problem Domain" (you can try to "fix" Problems). Any more, it is in the "Fact Domain".
Some users will always need "trucks". Intel has the workstation line for those. Their higher profit margins better fit their new, more-niche status.
The one ray of hope is actually tablets, I think. As tablets (and convertibles) become more and more powerful and useful for "general computing tasks", I think it is possible we see a reversal in the next few years, where most consumers end up with a tablet or ultra-lightweight-portable (MacBook Air or wannabe), but want a big desktop at home for large storage and "sitting at a desk" computing needs. Things like the All-in-Ones (iMac and wannabes) and small form factor style desktops might, just might, make a comeback then.
But they'll all be built from mobile parts. Small and sleek. Good economies of scale.
Us?
We'll probably be buying Xeons.
-
We'll probably be buying Xeons.
No, this is the opposite of what I want. I absolutely hate that Apple have only ever offered the Mac Pro with Xeons. OK, they have clearly neglected that line, and unless something is announced this WWDC, it's fairly safe to say they have abandoned them altogether.
But I wish they offered a line using consumer CPUs rather than Xeons and ECC memory. I don't need server-grade hardware that doubles the cost of the machine with very little performance benefit. (in a single CPU configuration)
-
No, this is the opposite of what I want. I absolutely hate that Apple have only ever offered the Mac Pro with Xeons. OK, they have clearly neglected that line, and unless something is announced this WWDC, it's fairly safe to say they have abandoned them altogether.
But I wish they offered a line using consumer CPUs rather than Xeons and ECC memory. I don't need server-grade hardware that doubles the cost of the machine with very little performance benefit. (in a single CPU configuration)
You assume that Intel will still be making socketed consumer grade CPUs in three or four years. I'm not so sure. I wouldn't be surprised if they go all-mobile in the consumer line in 3-5 years. I think there will always be a market for people who want high-end, "who cares about TDP, put her to the wall" performance in a desktop box. They're just going to all be "workstation class".
You might not have been one of the "us" to whom I was referring, though. ;) ;D
Also, I'm pretty sure the new Mac Pros are NOT coming at WWDC, but that doesn't mean they're abandoned. Something is coming, but it won't be now.
They aren't doing anything with that line until Ivy Bridge E arrives, and that won't be till September. You can make darn-good guesses about Apple's hardware release cycles by just watching Intel's release cycles (at least for their x86-based products).
-
Also, I'm pretty sure the new Mac Pros are NOT coming at WWDC, but that doesn't mean they're abandoned. Something is coming, but it won't be now.
They aren't doing anything with that line until Ivy Bridge E arrives, and that won't be till September.
Oh, I forgot that wasn't ready yet. Apple do sometimes get the chips early, but probably not that early.
Are you so sure that there will be a new machine though? People said the same thing about Sandy Bridge E.
-
No, this is the opposite of what I want.
I should add...
I agree with you, generally (that's why I'm able to run a consumer-grade Ivy in my "server" at home). There's just not enough of us, and if we can't "piggyback" on the economies of scale from Grandmas and corporate boxes... We turn into a niche, and we pay niche pricing.
ECC, however, is good. It is the one thing in Xeons that I wish I had in my machines at home. It is a little slower, but that's a good tradeoff for everyone but crazy LN2 overclockers. I wish they'd switched the consumer line over to ECC a long, long time ago so that we could all be benefiting from their economies of scale. But... I honestly think the enthusiasts would have screamed, so they never did it when it'd have made sense back in the Netburst or early-on Core days. It is sad.
As densities and switching speeds go up and up, the tiny, infrequent errors become more and more troublesome. You don't know it, but your 8-16GB of RAM is throwing errors out with regularity (also, cosmic rays).
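If anyone's curious how ECC can actually find and fix one of those flipped bits, here's a toy Python illustration of the idea. Real DIMMs do SECDED Hamming in hardware across 72-bit words; this just shows the core position-XOR trick on a little list of bits.

import random

def syndrome(bits):
    # XOR together the (1-based) positions of every set bit
    s = 0
    for pos, bit in enumerate(bits, start=1):
        if bit:
            s ^= pos
    return s

word = [random.randint(0, 1) for _ in range(16)]   # a pretend memory word
stored = syndrome(word)                            # kept alongside the data, like check bits

flipped = random.randrange(len(word))              # cosmic ray strikes
word[flipped] ^= 1

bad_pos = stored ^ syndrome(word)                  # the mismatch names the bad bit (1-based)
word[bad_pos - 1] ^= 1                             # flip it back
assert bad_pos == flipped + 1 and syndrome(word) == stored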
-
Are you so sure that there will be a new machine though? People said the same thing about Sandy Bridge E.
Tim Cook all but announced it at their last earnings call.
EDIT: Here's what I'm tea-leaf reading here:
Last year, after the kerfuffle from them not shipping Sandy Bridge E Mac Pros, he said (http://appleinsider.com/articles/12/06/12/tim_cook_confirms_updated_mac_pro_coming_in_2013):
Thanks for your email. Our Pro customers like you are really important to us. Although we didn’t have a chance to talk about a new Mac Pro at today’s event, don’t worry as we’re working on something really great for later next year. We also updated the current model today.
Then, at the earnings call, he strongly hinted that most of the exciting new stuff would be coming in the fall. This means, certainly, the iOS devices (and whatever "new category" device they've been mumbling about), but... If you read the Q&A from the Earnings Call, I think it was more than that. I think it was also supposed to be a hint that the Mac Pro replacement (and it'll be a replacement, not an update) is coming around the same time too.
Which times quite well with Ivy-E. So... ::)
I bet they'd have preferred to launch them now, and were probably pushing on Intel to give them early access or speed up the ramp or something, but after the Earnings Call, I figured that failed.
-
Oh, I forgot that wasn't ready yet. Apple do sometimes get the chips early, but probably not that early.
And, right.
They're the biggest high-end player now, so Intel does all sorts of things for them they don't do for others (just watch how long it takes for other companies to ship GT3+Crystalwell in volume, I bet). But, I don't think it'll be that early. Maybe. If so, then Intel really, really needs them.
-
You don't know it, but your 8-16GB of RAM is throwing errors out with regularity (also, cosmic rays).
Not saying you're wrong, don't get me wrong, but I'd like to understand what you're saying here. Oh, and it's off-topic too :P. How do you know this? Why, when I run memtest86 for several cycles over 24 hours or longer, does it come out without a single error?
-
Run it for three months. It'll happen (http://www.zdnet.com/blog/storage/dram-error-rates-nightmare-on-dimm-street/638).*
Because quantum mechanics is weird. Also, cosmic rays.
* And these numbers are old. As I mentioned, as switching speeds and densities increase, error rates go up. That particular study found the opposite, but I've seen other more recent ones showing faster DDR3 DRAM doing worse. And higher densities are the biggest enemy. More data == more places for a single bit to flip, which, because of quantum mechanics, is going to happen some fraction of the time.
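If you want to play with the numbers yourself, the back-of-the-envelope is simple - just note that the FIT rate below is a made-up placeholder, not a figure from that study (real per-DIMM rates vary wildly):

FIT_PER_MBIT = 100                 # hypothetical failures per billion device-hours per Mbit
GIGABYTES = 16
HOURS = 24 * 90                    # the "run memtest for three months" case

mbits = GIGABYTES * 1024 * 8
expected_flips = FIT_PER_MBIT * mbits * HOURS / 1e9
print(f"~{expected_flips:.0f} expected bit flips in {HOURS} hours")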