Data:
Sample (i.e. input) textures are 128x128 JPEG images.
(The first six input images are from the project website. The seventh
image is a resized, cropped picture I took of the Chester pool during
heavy rains on Monday using my Canon S45.)
Output textures are 512x512 PNG images.
Code:
I used Corona to read and write images. The code was compiled with
Visual C++ 6.0 under Windows XP Pro SP2 on a 2.26 GHz P4 Northwood
with 512 MB of DDR333 RAM. The code is downloadable here. The runtimes
were written to file: the runtimes for exhaustive search are here,
and the runtimes for candidate pixel optimization are here.
The runtimes increase over time because the runs were made back-to-back and I was too lazy to clear
the memory between runs.
Analysis:
Comparing my runtimes to those quoted by Wei and Levoy:
360 sec for a 200x200 texture using a 5x5 window (195 MHz R10000)
180 sec for a 512x512 texture using a 5x5 window (2266 MHz P4 Northwood)
We expect the 512x512 texture to take (512x512)/(200x200) = 6.5536 times longer
to run on the same processor.
This means that my runtimes are ~13 times faster (360 x 6.5536 = ~2359 sec scaled, versus 180 sec measured), which is pretty good
when reviewing other benchmarks comparing P4s and the R10000, and also considering the P4's long pipeline.
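The comparison above can be reproduced with a small sketch. The function names are hypothetical (they are not from the downloadable code), and the sketch assumes that runtime scales linearly with the number of output pixels for a fixed window and sample size.

```cpp
#include <cassert>

// Scale a reference runtime to a different (square) output size, assuming
// runtime is proportional to the number of output pixels.
double scaledSeconds(double refSeconds, int refSide, int newSide) {
    assert(refSide > 0 && newSide > 0);
    double pixelRatio = (double(newSide) * newSide) / (double(refSide) * refSide);
    return refSeconds * pixelRatio;
}

// Speedup of a measured runtime relative to the reference runtime after the
// reference has been scaled to the same output size.
double speedup(double refSeconds, int refSide, double mySeconds, int mySide) {
    assert(mySeconds > 0.0);
    return scaledSeconds(refSeconds, refSide, mySide) / mySeconds;
}
```

Calling speedup(360.0, 200, 180.0, 512) with the figures quoted above gives a value of roughly 13.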
I generated the neighborhood window for each pixel of the output texture
and stored it in memory to compare against the neighborhoods in the sample.
This way the memory being accessed was a significantly smaller array
(5x3 to 9x5) than the whole texture array (512x512). Presumably,
optimizations would be able to keep this in the CPU's L2 cache, which is
512 KB, instead of RAM. The texture array would not fit in the
cache. I have yet to see what would happen if I ran my code on a processor that has 2 MB of cache, such as a G5, P4EE, or Banias (Pentium M).
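A minimal sketch of this idea, not the downloadable code itself: copy the causal (already synthesized) part of a pixel's window into a small contiguous buffer, then compare that buffer against every sample neighborhood by sum of squared differences. The Image type, the function names, the grayscale pixels, and the wrap-around indexing are all assumptions made for illustration. For a 5x5 window the causal part spans 5x3 pixels, and for a 9x9 window it spans 9x5.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical grayscale image with wrap-around (toroidal) indexing.
struct Image {
    int width, height;
    std::vector<unsigned char> pixels;
    Image(int w, int h) : width(w), height(h), pixels(std::size_t(w) * h, 0) {}
    unsigned char at(int x, int y) const {
        int xx = ((x % width) + width) % width;
        int yy = ((y % height) + height) % height;
        return pixels[std::size_t(yy) * width + xx];
    }
};

// Copy the causal neighborhood of (x, y) into a small contiguous buffer:
// the full rows above the pixel, plus the pixels to its left in its own row.
std::vector<unsigned char> gatherCausal(const Image& img, int x, int y, int window) {
    assert(window >= 3 && window % 2 == 1);
    int half = window / 2;
    std::vector<unsigned char> out;
    out.reserve(std::size_t(window) * half + half);
    for (int dy = -half; dy <= 0; ++dy) {
        int lastDx = (dy == 0) ? -1 : half;  // stop before the pixel itself
        for (int dx = -half; dx <= lastDx; ++dx)
            out.push_back(img.at(x + dx, y + dy));
    }
    return out;
}

// Sum of squared differences between two equally sized neighborhoods.
long ssd(const std::vector<unsigned char>& a, const std::vector<unsigned char>& b) {
    assert(a.size() == b.size());
    long sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        long d = long(a[i]) - long(b[i]);
        sum += d * d;
    }
    return sum;
}

// Exhaustive search: index (y * width + x) of the sample pixel whose causal
// neighborhood is closest to the target buffer.
int bestMatch(const Image& sample, const std::vector<unsigned char>& target, int window) {
    long best = std::numeric_limits<long>::max();
    int bestIndex = -1;
    for (int y = 0; y < sample.height; ++y) {
        for (int x = 0; x < sample.width; ++x) {
            long d = ssd(gatherCausal(sample, x, y, window), target);
            if (d < best) { best = d; bestIndex = y * sample.width + x; }
        }
    }
    return bestIndex;
}
```

The target buffer is gathered once per output pixel and reused for every comparison, so the inner loop reads a few dozen bytes instead of indexing into the full 512x512 array. In this sketch the sample neighborhoods are regathered on every call for brevity; precomputing them once would avoid that repeated work.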
Results:
Click on thumbnails for full-size versions.