What about hardware acceleration via Quicksync? Does that work when using an external GPU or just when using the iGPU?
No, you have to be using the CPU's GPU.
However, Lucid Logix makes an application called Virtu that virtualizes your GPUs and lets you use a discrete GPU and Intel's QuickSync simultaneously. Here's a first look, and here's a little follow-up.
However, just to be clear, Intel's QuickSync is a GPU-accelerated video transcoding engine; it is, essentially, a hardware implementation of an H.264 encoder. It does not accelerate video decoding at all (though DXVA can, with supported filters).
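If you want to poke at QuickSync yourself, one common route today is ffmpeg's h264_qsv encoder (assuming a build compiled with QSV/libmfx support and an active iGPU). Here's a rough sketch of driving it from Python; the file names and bitrate are just placeholders, not a recommendation.

```python
# Rough sketch: transcode a clip through Intel QuickSync via ffmpeg's
# h264_qsv encoder. Assumes an ffmpeg build with QSV (libmfx) support and
# an active iGPU; file names and bitrate below are placeholders.
import subprocess

def quicksync_transcode(src, dst, bitrate="4M"):
    cmd = [
        "ffmpeg",
        "-i", src,              # input file
        "-c:v", "h264_qsv",     # hand H.264 encoding to the QuickSync engine
        "-b:v", bitrate,        # target video bitrate
        "-c:a", "copy",         # pass audio through untouched
        dst,
    ]
    subprocess.run(cmd, check=True)

quicksync_transcode("input.mkv", "output.mp4")
```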
Regarding GDDR5 cards versus a larger DDR3 frame buffer... the situation isn't always 100% clear, but a good rule of thumb for these kinds of low-end cards is to go for memory bandwidth and latency (speed) before size.
Mostly, the large frame buffers (certainly above 1GB) are for storing high-resolution textures for gaming at 2560x1440 and above.
Low-end GPUs are nowhere near powerful enough to drive any modern, heavily textured game fast enough at those kinds of resolutions, so the extra memory isn't needed there. Larger memory sizes can also come into play when you are doing GPGPU work on very large data sets. The on-card RAM is essentially a giant cache: it is "closer to" the GPU, it usually operates at much higher bandwidth than system RAM (unless the card ships with DDR3), and it doesn't have to go through the OS to be allocated, so it is generally faster than paging out to system RAM. The point of all this is to keep the GPU's shader pipeline fed, so the GPU doesn't sit idle waiting for data to arrive from system RAM. The buffer is filled from system RAM as the GPU works, but a fast GPU can eat the data faster than system RAM can refill the buffer.
Low-end GPUs have so few shader units, and are so (relatively) slow, that they'll never burn through 512MB or 1GB of on-board RAM before the buffer can be trickle-refilled from system RAM, which is also darn fast on a modern Sandy Bridge platform. They aren't fast enough at most tasks (except maybe contrived synthetic benchmarks) for it to matter; that's why, in the old days, low-end cards simply shared system RAM. But RAM is so cheap in volume now that shady graphics card OEMs can slap 2GB on a puny card and sell it for much more than they could charge for the same card with 512MB. The big number is a marketing vehicle, not a performance feature.
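To put some rough numbers on the "trickle-refill" point, here's a back-of-the-envelope sketch. The throughput figures are illustrative assumptions, not measurements of any particular card.

```python
# Back-of-the-envelope: can a low-end GPU outrun the refill rate of its
# frame buffer? All numbers below are illustrative assumptions, not
# measurements of any specific card.

frame_buffer_gb   = 1.0    # on-card RAM (the "giant cache")
pcie_refill_gbps  = 8.0    # rough PCIe 2.0 x16 practical transfer rate, GB/s
gpu_consume_gbps  = 5.0    # rate a low-end GPU can actually chew through data, GB/s

# Net drain rate once the trickle-refill from system RAM is counted:
net_drain = gpu_consume_gbps - pcie_refill_gbps

if net_drain <= 0:
    print("System RAM refills the buffer as fast as the GPU drains it; "
          "extra on-card RAM buys nothing.")
else:
    print(f"Buffer runs dry in ~{frame_buffer_gb / net_drain:.1f} s; "
          "a bigger buffer would actually help.")
```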
I've explained elsewhere why judging a card purely on GDDR5 vs GDDR4 vs DDR3 can be a big fallacy. What matters is not the RAM "type" but the data rate and latency. The data rate is a function of the memory's clock speed and the width of the GPU's memory bus. Take some GDDR5 chips attached to a GPU with a 256-bit memory bus, and compare them to the exact same chips running at the exact same "speed" but attached to a chip with only a 128-bit bus: the latter has literally half the overall data rate. Also, GDDR5 usually has higher latency per clock than DDR3; it can run at such high clock speeds that it usually still works out quicker in "real time" than the slower, lower-latency memory, but that only holds if the vendor is actually shipping high-bandwidth GDDR5, and isn't buying bottom-of-the-bin RAM chips and slapping them on an old GPU with a tiny memory interface.
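As a concrete illustration of why bus width matters as much as the memory type stamped on the box, here's a quick calculation. The clock figures are typical-ish examples, not specs for any particular card.

```python
# Theoretical peak memory bandwidth = effective data rate (transfers/s)
#                                     x bus width (bits) / 8 (bits per byte).
# The clocks below are typical-ish examples, not any specific card's spec.

def bandwidth_gbps(effective_mts, bus_bits):
    """Peak bandwidth in GB/s for a given effective memory rate (MT/s) and bus width."""
    return effective_mts * 1e6 * bus_bits / 8 / 1e9

# The same GDDR5 chips (4000 MT/s effective) on two different memory buses:
print(bandwidth_gbps(4000, 256))   # ~128 GB/s on a 256-bit bus
print(bandwidth_gbps(4000, 128))   # ~64 GB/s on a 128-bit bus -- literally half

# And "slow" DDR3 on a wide bus can beat cheap GDDR5 on a narrow one:
print(bandwidth_gbps(1800, 192))   # ~43 GB/s
print(bandwidth_gbps(3200, 64))    # ~26 GB/s
```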
And they do just that, selling those cards at higher margins to people who don't understand the intricacies.