Image compression — a basic overview (and whether you need to consider it when uploading to social media).
Image compression is very similar to most other kinds of file compression in principle. Essentially, it allows you to reduce the size of the image file in return for some processing time to perform the compression (and depending on the type of compression, you may lose some image quality).
As such there are two types of compression:
- Lossy compression: usually produces smaller filesizes, at the cost of reducing quality by some amount
- Lossless compression: doesn’t lose any quality, but isn’t as good at reducing filesizes as lossy compression is
The internet would work far less well without compression, because both images and videos are absolutely gigantic without it. For example, an average-size artwork (2480x3508 pixels) is usually above 24MB before you compress it, and bigger images can be several hundred megabytes or more. Imagine trying to download that on a slow connection. By contrast, a compressed average-size image can be less than 1MB with little or no noticeable loss in image quality. How is this possible?
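To see where the 24MB figure comes from, here is a quick back-of-the-envelope calculation. It assumes 8-bit RGB with no alpha channel (3 bytes per pixel), which is an assumption on my part; an image with transparency or a higher bit depth would be bigger still. The function name is just for illustration.

```python
def raw_size_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size: one value per colour channel for every pixel."""
    return width * height * bytes_per_pixel


# Assumption: 8-bit RGB, no alpha channel, so 3 bytes per pixel.
size = raw_size_bytes(2480, 3508)
print(f"{size:,} bytes, about {size / (1024 * 1024):.1f} MiB")
# prints "26,099,520 bytes, about 24.9 MiB"
```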
Encoding is the reason. To understand how encoding works, we can look at a simple example; there are many types of encoding, but here we will look at run-length encoding.
It’s a simple concept really: if for example we assume data is made up of letters only (a-z), we can encode any series of letters as “this many of one letter, followed by this many of another letter…” and so on.
Some examples:
- “aaaaaabbttttbbbb” — 6a2b4t4b
- “ddddddddaaaaaaaatypppppp” — 8d8a1t1y6p
- “abcdefg” — 1a1b1c1d1e1f1g
As the last example shows, run-length encoding is not effective when the data contains few repeated runs, and it can even produce output that is larger than the original (here, 14 characters instead of 7).
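The examples above can be reproduced with a few lines of Python. This is only a minimal sketch of run-length encoding on text; the function names are my own, and the decoder assumes the original text contains no digits.

```python
from itertools import groupby


def rle_encode(text):
    """Encode each run of repeated characters as <count><character>."""
    return "".join(f"{len(list(run))}{char}" for char, run in groupby(text))


def rle_decode(encoded):
    """Reverse rle_encode. Assumes the original text contained no digits."""
    out = []
    count = ""
    for ch in encoded:
        if ch.isdigit():
            count += ch  # counts may have more than one digit
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)


for sample in ("aaaaaabbttttbbbb", "ddddddddaaaaaaaatypppppp", "abcdefg"):
    encoded = rle_encode(sample)
    print(f"{sample} ({len(sample)} chars) -> {encoded} ({len(encoded)} chars)")
```

The first two samples shrink, while the last one doubles in size, which is exactly the weakness described above.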
Some types of image compression do this, but most involve transform coding, which is effectively a way of using mathematics to intelligently compress data (and, additionally, to make it easier to discard data that is not deemed necessary, such as colours the human eye doesn’t really notice). The PAL and NTSC systems for TV colour make use of this too.
The most prevalent image format today is JPG, and for a very good reason: it is extremely good at what it does. Other image formats like PNG and TIF have their place, but JPG is an all-around excellent format, which makes use of Discrete Cosine Transform encoding, a form of transform encoding. The result is that an image can be massively reduced in filesize (easily over 90%) with almost zero noticeable loss in quality, even to someone who is zoomed in and actively looking for defects.
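To give a feel for what a Discrete Cosine Transform does, here is a toy sketch on a single row of eight pixel values. It is not how a real JPEG encoder is written: JPEG works on 8x8 blocks in two dimensions and rounds the coefficients (quantization) rather than simply zeroing them, then applies further lossless coding. The pixel values and function names below are made up for illustration.

```python
import math


def dct(samples):
    """Orthonormal DCT-II: turn samples into frequency coefficients."""
    n = len(samples)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(samples)
        )
        out.append(scale * total)
    return out


def idct(coeffs):
    """Inverse of dct(): rebuild the samples from the coefficients."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


def keep_low_frequencies(coeffs, keep):
    """Crude 'lossy' step: zero everything except the first `keep` coefficients."""
    return coeffs[:keep] + [0.0] * (len(coeffs) - keep)


row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical brightness values
coeffs = dct(row)
approx = idct(keep_low_frequencies(coeffs, 3))

print("coefficients:", [round(c, 1) for c in coeffs])
print("original:    ", row)
print("rebuilt:     ", [round(v) for v in approx])
```

For smooth image data most of the information tends to end up in the first few coefficients, so the later ones can be stored very coarsely (or not at all) without the rebuilt row changing much.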
Tradeoff between encoding time and resulting filesize and quality
It’s important to note that for any given kind of file (e.g. a JPG file), there are many different encoders, and many ways to configure each one.
For example, JPGs can be encoded with libjpeg, libjpeg-turbo, mozjpeg or Guetzli, with differing encoding times and results. This is because sometimes you want to optimise for encoding speed (e.g. when you want to encode an image on the fly, or don’t need amazingly high quality), and sometimes you want to optimise for the best possible resulting image.
JPGs can also be given a “quality level” from 0 to 100 which denotes, in arbitrary terms, how much data to discard (the higher the number, the less is thrown away). A 100-quality JPEG is a big file compared to even a 90-quality JPEG, although it is still significantly smaller than the uncompressed image. Note that even a 100-quality JPEG is still lossy compression.
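As a rough illustration of what the quality number does under the hood, the sketch below follows the scaling convention used by the IJG libjpeg encoder as I understand it: the quality level scales a table of divisors, and bigger divisors mean coarser rounding. Other encoders may map the number differently, so treat this as an assumption rather than a rule. Only the first row of the example brightness table from the JPEG standard is used, and the function names are my own.

```python
# First row of the example luminance quantization table from the JPEG
# standard (Annex K). Each value is a divisor: bigger means coarser rounding.
BASE_ROW = [16, 11, 10, 16, 24, 40, 51, 61]


def scale_for_quality(quality):
    """Map a quality level to a percentage scale (IJG libjpeg convention, assumed)."""
    quality = max(1, min(100, quality))
    return 5000 // quality if quality < 50 else 200 - 2 * quality


def scaled_row(quality):
    """Scale the base divisors for this quality level, clamped to 1..255."""
    scale = scale_for_quality(quality)
    return [max(1, min(255, (q * scale + 50) // 100)) for q in BASE_ROW]


for quality in (50, 90, 100):
    print(quality, scaled_row(quality))
# prints:
# 50 [16, 11, 10, 16, 24, 40, 51, 61]
# 90 [3, 2, 2, 3, 5, 8, 10, 12]
# 100 [1, 1, 1, 1, 1, 1, 1, 1]
```

Even at 100 every divisor is 1, so the values are still rounded to whole numbers, which is one reason a 100-quality JPEG is not lossless.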
Another example would be videos. The most common video format these days is MP4, and most MP4 videos use the H.264 codec, typically encoded with x264 (an older codec that is fast to encode). Currently, HEVC (H.265, commonly encoded with x265) is slowly replacing it; it takes longer to encode but produces smaller videos at the same quality, which for most video purposes is definitely preferable.
Should you think about any of this when posting images to social media?
99.999% of the time, nobody is ever going to notice the compression if an image you upload is resized or compressed by Twitter or anywhere else. On the other hand, a lot of people on slow or mobile connections will scroll straight past your art if it’s taking ages to load because the filesize is 20x bigger than it needs to be.
As a result, it’s best not to try and get around image compression on websites. Not only is it counterproductive as explained above, but rules on what gets re-compressed and what doesn’t are notoriously fickle and ever-changing.