How to preserve an archive of PCX files

A reader calling themselves “Specious” commented recently: “Another lost function from both Word and the OS is the ability to import PCX files. These used to be ubiquitous graphics files. Although little-used today, there are thousands of PCX graphics libraries – and, of course, many original archive documents were scanned and saved as PCX files.

“It would be very useful if the PCX graphics format could be re-integrated into Windows Explorer (from XP onwards) and also directly importable into Word documents. While there have been a few claimed kludges, with directions to import legacy graphics filters from, say, an old version of Microsoft Paint, nothing works as it did up to 2003. Perhaps there’s a challenge here for PC Pro and its readers.”

The PCX graphics file format was invented in the early 1980s, when the world was using MS-DOS PCs with CGA and EGA graphics adapters. It predates even the VGA graphics adapter. CGA, the first colour graphics adapter for PCs, could show only 16 different colours, while the superior EGA could show a palette of up to 16 colours chosen from a total of 64.

Which colour number (0-15) mapped to what actual colour had to be set up in the colour palette chip on the display card, and the PCX file format stored this palette information in a fixed-length header at the beginning of each file. If you tried to show two different images that employed different palettes on the same screen at the same time, you were often left with one image looking okay while the other had garbled colours because it was being shown with the wrong palette.
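
To make that fixed-length header concrete, here is a minimal Python sketch that pulls the image dimensions and the 16-entry palette out of the first 128 bytes of a file. The byte offsets follow ZSoft's published header layout as I understand it, so check them against the specification before relying on them; the names PcxHeader and parse_pcx_header are my own inventions for illustration.

```python
import struct
from typing import NamedTuple

HEADER_SIZE = 128  # assumed fixed header length, per the ZSoft layout


class PcxHeader(NamedTuple):
    version: int
    bits_per_pixel: int
    width: int
    height: int
    planes: int
    bytes_per_line: int
    palette: list  # 16 (red, green, blue) tuples


def parse_pcx_header(data: bytes) -> PcxHeader:
    """Decode the fields of a PCX header that matter for display."""
    if len(data) < HEADER_SIZE or data[0] != 0x0A:
        raise ValueError("not a PCX file")
    # Bytes 1-11: version, encoding, bits per pixel, then the image window.
    version, _encoding, bpp, xmin, ymin, xmax, ymax = struct.unpack_from(
        "<BBBHHHH", data, 1
    )
    # Bytes 16-63: sixteen palette entries of three bytes each.
    raw = data[16:64]
    palette = [tuple(raw[i:i + 3]) for i in range(0, 48, 3)]
    # Byte 65: number of colour planes; bytes 66-67: bytes per scan line.
    planes, bytes_per_line = struct.unpack_from("<BH", data, 65)
    return PcxHeader(
        version, bpp, xmax - xmin + 1, ymax - ymin + 1,
        planes, bytes_per_line, palette,
    )
```

Calling parse_pcx_header on the first 128 bytes of a file returns the dimensions and palette without touching the image data.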

Finally, the VGA card was invented – which displayed 256 colours out of a possible 262,144 – and the PCX format couldn’t accommodate this enlarged palette in its header. The format therefore had to be amended to store the larger palette at the end of the file, with a flag in the header saying whether or not a VGA palette was present. This was inefficient because the computer had to read the file header, discover that the image had its palette at the end, jump to read the last 768 bytes of the file to retrieve the palette information, then jump back to the beginning to start reading the actual image data. If you just read the file straight through, such an image would at first appear totally garbled because it was being shown with a spurious palette, then suddenly change to display correctly once the real palette had been loaded.
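
The jump to the end of the file can be sketched in a few lines of Python. One detail is an assumption on my part rather than something stated above: in the commonly documented layout, the 768 palette bytes are preceded by a single marker byte of value 12, so the code looks at the last 769 bytes. The function name read_vga_palette is hypothetical.

```python
VGA_PALETTE_MARKER = 0x0C      # assumed marker byte before the palette
VGA_PALETTE_BYTES = 256 * 3    # 256 entries of red, green and blue
HEADER_SIZE = 128


def read_vga_palette(data: bytes):
    """Return 256 (r, g, b) tuples, or None if no trailing palette is found."""
    tail_size = VGA_PALETTE_BYTES + 1
    if len(data) < HEADER_SIZE + tail_size:
        return None  # too short to hold a header and a trailing palette
    tail = data[-tail_size:]
    if tail[0] != VGA_PALETTE_MARKER:
        return None  # no marker, so treat the file as having no VGA palette
    # Skip the marker, then take the palette three bytes at a time.
    return [tuple(tail[i:i + 3]) for i in range(1, tail_size, 3)]
```

A loader that wants to avoid the flash of garbled colours would call something like this before decoding any image data.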

Specious remembers correctly that for many years PCX files were ubiquitous. They were used to distribute a lot of clip-art, many scanners output PCX files, and even some fax applications employed them. However, their main use was in the MS-DOS application PC Paintbrush (made by ZSoft Corporation), for which they were originally designed. This was a simple painting and image manipulation tool that came bundled with many early mice, to give people a reason for purchasing one.

PC Paintbrush was later adapted to run under Windows and so survived for a remarkably long time, as did its file format. However, the file format is now so antiquated and graphics technology has moved on so far that it really is time to consider trading it in for something better. While it might technically be possible to write an Add-in for Microsoft Word that enables you to import PCX files, it would take a lot of effort and would solve the problem only for Word. There are lots of other applications out there that can’t use this archaic format, so it would be more sensible to ditch the PCX file format by bulk converting your file archive to a more modern format that has more widespread support.

Modern image file formats can display 16 million colours, not just 256 chosen out of 262,144, and they do a far better job of compressing these images than does the feeble RLE (run-length encoding) algorithm employed in the PCX format. As many modern software packages don’t support the PCX format, a legacy collection of PCX files is rapidly becoming a liability rather than an asset, and I’d be inclined to convert any PCX files I have to a format such as PNG or TIFF.
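
To show quite how simple that RLE scheme is, here is a rough Python decoder. It assumes the usual PCX convention, which isn't spelled out above: a byte with its top two bits set holds a repeat count in its lower six bits and is followed by the value to repeat, while any other byte stands for itself. The function name pcx_rle_decode is hypothetical.

```python
def pcx_rle_decode(data: bytes, expected_size: int) -> bytes:
    """Expand PCX run-length-encoded data to expected_size bytes."""
    out = bytearray()
    i = 0
    while len(out) < expected_size and i < len(data):
        byte = data[i]
        i += 1
        if byte >= 0xC0:
            # Top two bits set: the low six bits are a run length (0-63).
            count = byte & 0x3F
            if i >= len(data):
                raise ValueError("run count with no value byte after it")
            out.extend(data[i:i + 1] * count)
            i += 1
        else:
            # Any other byte is a literal pixel value.
            out.append(byte)
    return bytes(out[:expected_size])
```

Under this convention a single pixel whose value is 192 or more has to be written as a one-byte run, costing two bytes, which is one reason dithered or noisy images can come out larger than the raw data.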

Which format to convert to?

I wouldn’t convert them to JPEG because that’s a lossy format that throws away some information in order to make the file smaller, and once that information is lost you can’t ever retrieve it. Also, because PCX files often had to contain areas of dithering – adjacent dots of different colours used to simulate a colour that’s unobtainable in the current palette – converting them to JPEG could even make the files grow larger as JPEG doesn’t cope well with dithered colours.

I wouldn’t convert to BMP either, because that usually employs no compression at all and so makes for much larger files (256-colour BMP files may have RLE or Huffman compression, but it’s optional and so depends on the application being used).

GIF is another possible format to consider but it, too, is both old-fashioned and restrictive. It was invented back in 1987 and so is stuck with a palette of just 256 colours. It was also hobbled because for ten years (1994-2004) Unisys insisted that anyone who supported it within their application had to pay a licence fee to use the LZW (Lempel–Ziv–Welch) compression algorithm on which it relied. The PNG format was designed to replace GIF, and it has much greater colour depth (16 million colours), better transparency support (an 8-bit alpha channel, where GIF has only a single-bit alpha channel) and more efficient compression than GIF. The only feature PNG lacks compared to GIF is the ability to animate images. PNG is now the best lossless image format available for most applications, and is the default format used internally by Microsoft Office applications.

Most image-editing packages that can read PCX files can write PNG and so are able to convert files from one format to the other, but if you have many files in the old format you’ll probably want to use a batch conversion routine to convert all the files in one operation. There are many packages that can do this for you: IrfanView is one possibility that’s free for non-commercial use. Commercial users should buy a licence for €10 per user, but there are discounts available for bulk purchases.

Once you’ve converted all your PCX files, which takes only a few seconds per file, you can archive all the original PCX files to CD or DVD and then use the PNG files in their place. Most batch converters will happily search down through subfolders to find images to convert, and put the converted files alongside the original image or in another folder if you’d prefer. If you have tens of thousands of old PCX files you may need to leave the batch conversion running overnight, but so long as you don’t run out of disk space you’re unlikely to have any problems.
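
If you'd rather script the job than use a packaged converter, the same search-and-convert routine might look something like this in Python. This is only a sketch: the function names are my own, and the conversion step assumes the third-party Pillow library is installed, which reads PCX and writes PNG. Pass out_dir to collect the PNG files in a separate folder tree, or leave it out to put each one alongside its original.

```python
from pathlib import Path


def plan_conversions(root, out_dir=None):
    """Pair every .pcx file under root with the .png path it should become."""
    root = Path(root)
    pairs = []
    for src in sorted(root.rglob("*")):  # rglob walks down through subfolders
        if src.is_file() and src.suffix.lower() == ".pcx":
            if out_dir is None:
                dst = src.with_suffix(".png")  # alongside the original
            else:
                # Mirror the subfolder structure under out_dir.
                dst = Path(out_dir) / src.relative_to(root).with_suffix(".png")
            pairs.append((src, dst))
    return pairs


def convert_all(pairs):
    """Convert each pair, recording failures instead of stopping."""
    from PIL import Image  # assumption: Pillow is installed (pip install Pillow)

    failures = []
    for src, dst in pairs:
        try:
            dst.parent.mkdir(parents=True, exist_ok=True)
            with Image.open(src) as im:
                im.save(dst, format="PNG")
        except Exception as exc:  # corrupt or unreadable file: note it, carry on
            failures.append((src, str(exc)))
    return failures
```

Running convert_all(plan_conversions("archive", "archive_png")) would return a list of any files that couldn't be converted; the folder names here are just placeholders.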

Inspect a few sample images to see how much bigger or smaller the converted files are than the originals, and then use that ratio to estimate how much extra storage space you’ll need. Depending on which conversion utility you use, it may stop if it encounters a corrupt file or may just log the problem and carry on converting. Unless you can find a good copy of any corrupt file from a backup, there probably isn’t much you can do about these (trying to fix the image header with a hex editor to retrieve a corrupted image usually isn’t worth the effort).
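
The storage estimate is simple arithmetic, sketched here in Python with a hypothetical function name. It assumes your sample is reasonably representative of the whole archive; a collection that mixes line art with dithered scans may need a sample from each kind.

```python
def estimate_converted_size(sample_pairs, total_original_bytes):
    """Scale the archive's total size by the ratio seen in a sample.

    sample_pairs is a list of (original_bytes, converted_bytes) tuples.
    Returns the size ratio and the estimated total size after conversion.
    """
    original = sum(o for o, _ in sample_pairs)
    converted = sum(c for _, c in sample_pairs)
    if original == 0:
        raise ValueError("sample contains no data")
    ratio = converted / original
    return ratio, round(total_original_bytes * ratio)
```

For instance, if two sample files shrink from 1,000 to 600 bytes and from 3,000 to 1,400 bytes, the ratio is 0.5, so a 2,000,000-byte archive would need roughly 1,000,000 bytes for its PNG copies.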
