I think everyone who has upgraded from a camera whose resolution was just a little higher than that of their desktop monitor to one that is much higher has been disappointed to find that the resulting new images just didn't seem much sharper. Or any sharper. Or as sharp. It's interesting to me because I've experienced this situation and I continue to experience it. I wish I understood it better.
The issue seems to be not so much the actual number of pixels on a camera's sensor as how small the pixels are and how densely they are packed. The idea is that denser sensors are prone to a quicker onset of a sharpness-robbing effect known as diffraction. As I understand it, diffraction, or the bending of light around the edges of an opening in a lens (or a pixel array), is what causes the primary sharpness issues, but it's the overlapping of "Airy disks" that lowers resolution on the sensor.
Here's an in-depth and well-done article that I found on diffraction and its various effects: http://www.cambridgeincolour.com/tutorials/diffraction-photography.htm
When I look at the details of the "science" I can understand that diffraction makes images progressively less sharp after a certain point. There is a calculator (actually two) in the linked article that shows the effect of pixel density and sensor size on diffraction limits. It shows, theoretically, the smallest aperture (highest f-number) you can use for a given sensor size and pixel density before diffraction rears its mathematical ugly head and starts causing problems vis-a-vis sharpness.
I used the calculator for several different camera sensor sizes and densities. What I found was that on an APS-C sized sensor the system becomes "diffraction limited" (the point where sharpness starts to gradually decline---it's not on or off in a binary sense) based on the density of the pixel packing. A 24 megapixel sensor (like the one in my D7100) hits the wall at f5.9. If I use a D7000 with 16 megapixels instead, the diffraction limit sets in at f7.3, and if I use a 12 megapixel camera the diffraction limit steps into the equation at f8.4.
If I use my micro four thirds cameras at 16 megapixels we become diffraction limited at f5.9 (the same as the APS-C at 24 megapixels....) and if we were able to wedge 24 megapixels into the next generation of m4:3 sensor we'd see diffraction rear its ever-softening head at f4.8. Best case in the current market, with respect to delayed onset of diffraction, would be the Sony a7S at 12 megapixels. The calculation shows that lenses on that camera don't become limited until hitting f12.7.
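You can get numbers very close to these with a little arithmetic. The sketch below is my own reconstruction of what a calculator like the one in the linked article appears to do, under one stated assumption: the "diffraction limited" point is taken as the f-stop where the Airy disk diameter (2.44 times wavelength times f-number, for green light at about 550 nm) grows to span two pixels. The sensor dimensions are approximate, so the results land within a tenth of a stop or so of the article's figures rather than matching them exactly.

```python
# Rough reconstruction of a diffraction-limit calculator (my sketch,
# not the article's actual code). Assumption: diffraction becomes
# limiting when the Airy disk diameter reaches twice the pixel pitch.

WAVELENGTH_UM = 0.55  # green light, roughly mid-visible-spectrum

def pixel_pitch_um(sensor_width_mm, horizontal_pixels):
    """Width of one pixel, in micrometers."""
    return sensor_width_mm * 1000.0 / horizontal_pixels

def diffraction_limit_fstop(sensor_width_mm, horizontal_pixels):
    """f-number at which the Airy disk spans about two pixels."""
    pitch = pixel_pitch_um(sensor_width_mm, horizontal_pixels)
    return 2.0 * pitch / (2.44 * WAVELENGTH_UM)

# Cameras mentioned above: (approximate sensor width mm, horizontal pixels)
cameras = {
    "Nikon D7100 (APS-C, 24 MP)":   (23.5, 6000),
    "Nikon D7000 (APS-C, 16 MP)":   (23.6, 4928),
    "APS-C, 12 MP":                 (23.6, 4288),
    "Micro 4/3, 16 MP":             (17.3, 4608),
    "Sony a7S (full frame, 12 MP)": (35.8, 4240),
}

for name, (width_mm, px) in cameras.items():
    print(f"{name}: diffraction limited near f/{diffraction_limit_fstop(width_mm, px):.1f}")
```

Running this puts the D7100 near f/5.8, the D7000 near f/7.1 and the a7S near f/12.6, in line with the calculator's f5.9, f7.3 and f12.7. The small differences come down to the exact sensor dimensions and wavelength you plug in.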
The mind reels, but essentially there's a fixed pattern that tells us you can have some stuff but not other stuff. If you are shooting with a 16 megapixel micro four thirds camera it really behooves you to buy fast lenses that are well corrected wide open and at wider apertures. By the time you hit f5.6 you've almost got a foot in the optical quicksand. Any improvement you get by stopping down to tame lens aberrations is probably canceled out by the advancing onset of diffraction.
So, the mind boggles even more. If I am shooting outside and want maximum depth of field, the numbers tell me that I might be better off shooting with a less densely packed sensor. If I needed f11 to get sharp focus on a big bridge, for example, I might be better off shooting on a 12 megapixel camera than a 24 megapixel camera. While the depth of field remains the same if the sensors have the same overall geometry, the more densely packed sensor will succumb to unsharpness at lower f-stops. Now, theoretically, if I resized the 24 megapixel image down to the same size as the 12 megapixel file I'd get the same level of sharpness. At least that's what I gather. But there are so many other variables.
The optical detail transferred by our lenses is limited by each lens's ability to deliver sharply defined points. A lens's output quality has to do with something called the Airy disk, which limits its ability to deliver more resolution beyond a certain point as well. The Airy disk is the two-dimensional pattern that a point of light forms when delivered by an optical system to film or to a sensor. As the pixels get smaller, more of them are covered by the same single Airy disk delivered by the optical system. Additionally, when Airy disks overlap they lose their resolving ability by a certain amount. Also, there are different sub-calculations for the different wavelengths of the different color spectra.
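The "more pixels covered by one Airy disk" idea is easy to see numerically. This small sketch (my own illustration, using the standard Airy disk diameter formula of 2.44 times wavelength times f-number, with green light at 550 nm assumed) compares a D7100-class pixel of roughly 3.9 micrometers with a 12 megapixel full-frame pixel of roughly 8.4 micrometers:

```python
# Sketch: how many pixels one Airy disk spans, for two pixel sizes.
# Assumes green light (~550 nm = 0.55 um).

def airy_disk_diameter_um(f_number, wavelength_um=0.55):
    """Diameter of the Airy disk for a given f-number."""
    return 2.44 * wavelength_um * f_number

def pixels_per_airy_disk(f_number, pixel_pitch_um):
    """Linear count of pixels covered by a single Airy disk."""
    return airy_disk_diameter_um(f_number) / pixel_pitch_um

# ~3.9 um = D7100-class pixel; ~8.4 um = 12 MP full-frame pixel
for pitch in (3.9, 8.4):
    for f in (4, 8, 16):
        span = pixels_per_airy_disk(f, pitch)
        print(f"pitch {pitch} um at f/{f}: Airy disk spans {span:.1f} pixels")
```

At f/16 the Airy disk is about 21 micrometers across: more than five of the small pixels wide, but only about two and a half of the big ones. The dense sensor is oversampling a blur, not capturing extra detail.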
If the information represented by each Airy disk is spread over more and more smaller and smaller pixels there can be a reduction of sensor artifacts, but it will be offset by the resolution limits of the actual lens. One of the reasons some lenses are brutally expensive is that the designers have opted to make their lenses as sharp as possible (or diffraction limited) wide open so that one doesn't need to stop down to get better lens performance. The old way of designing lenses (especially fast ones) was to do the best design you could and aim for highest sharpness two stops down from maximum aperture. You see that in most of the "nifty-fifty" inexpensive normal focal length lenses: lots of aberrations along with unsharp corners and edges when used wide open, but then shaping up nicely by f5.6. Now, with high density sensors, you'll start to find that f5.6 might also become your new minimum f-stop, which, for all intents and purposes, means that your mediocre (wide open) lens has only one usable f-stop: the one right before diffraction sets in.
When you overlay the idea of Airy disks and their effect on resolution with the earlier onset of diffraction on denser sensors, you can see why an image from a lower density sensor might look better on screen at normal magnifications than the same scene shot with the same lens on a much higher resolution system. The difference is in acuity, or perceived sharpness. At the diffraction-limited point it's the edge effect that gets eroded: the contrast between tones is reduced, which reduces our perception of the sharpness of the image.
What a weird conundrum but there it is. I started thinking about this when I started shooting a D7100 next to a D7000 and started finding the 7000 images (16 megapixel sensor) much sharper in appearance. At the pixel level the D7100 was sharper but on the screen the D7000 images were more appealing. And if the target is the screen then all of the theoretical information is just more noise.
There are really so many more things at work here than I understand when I compare images from different cameras. There are generational issues having to do with noise reduction and dynamic range that shift the results and our ideas of what constitutes "a good camera." But Sony has done something that seemed at the time driven by the needs of video but is at the same time revelatory of what we can see when we strip away some of the muddying factors that have made us want the higher megapixel cameras (= more DR and less overall noise). They recently introduced a full frame camera at 12 megapixels that combines state of the art noise handling with beyond state of the art dynamic range on that sensor. Now the seat-of-the-pants evaluations and the awarding of "best imaging" prizes to the highest megapixel cameras are called into question. It may be that there will be a trend back toward rational pixel density, driven by the very need for quality that drove us in the other direction. They've changed the underlying quality of the sensors, and that may allow us to go back to being able to stop down for sharpness and to skirt some of the constraints of the laws of physics as they apply to optical systems. And in the end we benefit with both great looking files and far more flexibility in shooting and lens choice.
But, as I've said, I don't understand all the nuts and bolts of this and this article is an invitation for my smart readers to step in and flesh out the discussion with more facts and less conjecture. Have at it if you want to....