Why don't more companies resize images client-side first using <canvas>, and then save the server some work by only asking it to verify the result by:
- resizing to the same size
- removing metadata
This results in much faster transfers (often 10x less bandwidth for mobile uploads) and reduces server load by "farming" out the work to the clients.
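To make the "verify" step concrete, here is a minimal sketch of a lighter-weight server-side check than a full re-resize: confirm the declared dimensions are within the limit and drop the metadata chunks. It assumes PNG uploads only because PNG can be parsed with the standard library; most mobile uploads are JPEG and would need a real decoder. The names `check_and_strip`, `max_side`, and `METADATA_CHUNKS` are made up for illustration, and since the pixel data is never decoded this does not replace a proper re-encode.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Chunks that carry text, EXIF, or timestamps (hypothetical drop list).
METADATA_CHUNKS = {b"tEXt", b"zTXt", b"iTXt", b"eXIf", b"tIME"}


def pack_chunk(ctype, data):
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(ctype + data)
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


def iter_chunks(png):
    """Yield (type, data) for every chunk, validating length and CRC."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        header = png[pos:pos + 8]
        if len(header) < 8:
            raise ValueError("truncated chunk header")
        length, ctype = struct.unpack(">I4s", header)
        end = pos + 12 + length
        if end > len(png):
            raise ValueError("truncated chunk")
        data = png[pos + 8:end - 4]
        (crc,) = struct.unpack(">I", png[end - 4:end])
        if zlib.crc32(ctype + data) != crc:
            raise ValueError("bad chunk CRC")
        yield ctype, data
        pos = end


def check_and_strip(png, max_side):
    """Reject oversized uploads and return the PNG without metadata chunks."""
    chunks = list(iter_chunks(png))
    if not chunks or chunks[0][0] != b"IHDR" or len(chunks[0][1]) != 13:
        raise ValueError("missing or malformed IHDR")
    width, height = struct.unpack(">II", chunks[0][1][:8])
    if max(width, height) > max_side:
        raise ValueError("client did not resize: %dx%d" % (width, height))
    kept = [pack_chunk(t, d) for t, d in chunks if t not in METADATA_CHUNKS]
    return PNG_SIGNATURE + b"".join(kept)
```

The server only falls back to doing the expensive resize itself when this check raises, so well-behaved clients cost it little more than a header parse.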
# Edit: On Keeping Full Resolution Images
Some people mention that having the original, highest-resolution images is important. I don't think that is true for most applications.
Most apps don't need high-resolution history as much as current, live engagement, so older photos being smaller isn't a big deal. As technology moves on, you simply start allowing higher-res uploads. YouTube, Facebook, and others have done this fine, as the older stuff is replaced with the new/current/now() content.
In fact, even our highest-resolution images today will look low-quality in the future. Pick a good max size for your site (4k?) and resize everything down to that. In a year, bump it up to 6k, then 10k, etc.
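The "pick a max size and resize everything down to that" rule is just a bit of arithmetic. A minimal sketch, assuming "4k" means a longest side of 4096 pixels (my reading, not something stated above) and that smaller images are left alone rather than upscaled; `fit_within` is a made-up name:

```python
def fit_within(width, height, max_side):
    """Return (width, height) scaled down so the longest side is <= max_side.

    The aspect ratio is preserved and images already within the limit are
    returned unchanged (no upscaling).
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("dimensions must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    # Scale both sides by the same factor; never let a side collapse to 0.
    new_width = max(1, round(width * max_side / longest))
    new_height = max(1, round(height * max_side / longest))
    return new_width, new_height


print(fit_within(8192, 6144, 4096))  # (4096, 3072)
print(fit_within(1920, 1080, 4096))  # (1920, 1080), already small enough
```

Raising the limit later only means changing `max_side`; anything uploaded before the bump simply stays at its old size.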
Keeping costs low has its benefits, especially for us startups. Now, if you have massive collateral, then knock yourself out.
There is already an (unofficial Google) image proxy written in Go that is quite fast, does caching (local or backed by S3/GCS), and does other nice things like smart cropping: https://github.com/willnorris/imageproxy
Seemed like a lot of unnecessary work for them to reimplement a service from scratch without gaining any major perf benefits over their existing one and without leaning on an existing well-known and well-built foundation.
Link to the resulting open-source project:
I’d be very worried about a security issue with the unsafe C++ code.
You really have to run this kind of complex parsing in a disposable, containerized environment to do it safely. Or do everything carefully and in a memory-safe language.
How is the security? Any sort of image processing is a potential exploitation point. I see it says it uses the 'mature' libjpeg-turbo and libpng libraries, along with giflib for .gifs, but even with full trust of those, the C code, patches, and changes on top could be more exploitation points. You can look through ImageMagick alone to see all the fun things possible when seemingly basic processing turns into exploits. https://www.cvedetails.com/vulnerability-list/vendor_id-1749...
> Today, Media Proxy operates with a median per-image resize of 25ms and a median total response latency of 85ms. It resizes more than 150 million images every day. Media Proxy runs on an autoscaled GCE group of n1-standard-16 host type, peaking at 12 instances on a typical day.
Did it seem to anyone else that sticking to Python would have been way easier? It didn’t seem like any of the performance gains were through Golang.
Does anybody know how well libvips https://github.com/DAddYE/vips compares to lilliput performance-wise?
Nice, but why? https://cloudinary.com, https://www.imgix.com, or https://www.filestack.com already exist and are well worth it for 99% of apps. Even at scale, it really doesn't cost that much to have someone else do it. You can use a thin proxy through your existing CDN if you want to save on their bandwidth fees.
Also http://thumbor.org and https://imageresizing.net if you want something to host yourself; both are already very fast and well tested. Put them in a Docker container on a Kubernetes cluster and it's all done in an hour.
This post reminded me of a very old article from Yahoo/Tumblr explaining how they were (ab)using Ceph to generate thumbnails on the fly as pictures were uploaded using the Ceph OSD plugin interface.
Unfortunately the post seems to have disappeared from the internet (it was probably around 6 years ago), so here are some other teasers:
Disclaimer: not affiliated with Ceph apart from being a happy sysadmin.
I wish CloudFront supported resize parameters so we wouldn't have to keep building these or paying a lot for Imgix.
I wonder why people implement such things on the CPU?
PCI Express is ~100 Gbit/s, much faster than any network interface. Internally, a GPU can resize these images an order of magnitude faster than that; see the fillrate columns in the GPU spec.
Is there any open-source image proxy project that can do this?
E.g., instead of this
we can create an alias like octo and the URL will become this
That’s 1700 images per second. Doable on one (beefy) box. 3 to account for the diurnal cycle. Am I supposed to be impressed?
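For anyone checking the math, a quick sketch using only the two figures quoted from the post (150 million images per day, 25 ms median resize). Treating the median as if it were the mean is my simplification, and this counts resize time only, not decoding, network, or cache misses:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(images_per_day):
    """Average images per second over a full day."""
    return images_per_day / SECONDS_PER_DAY


def busy_cores(rate_per_second, seconds_per_resize):
    """Cores kept busy on average: arrival rate times time per resize."""
    return rate_per_second * seconds_per_resize


rate = average_rate(150_000_000)
cores = busy_cores(rate, 0.025)
print(f"{rate:.0f} images/s, about {cores:.1f} cores busy resizing on average")
# prints "1736 images/s, about 43.4 cores busy resizing on average"
```

The daily peak will sit above that average, which is what the factor of 3 for the diurnal cycle is meant to cover.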