NPIGG - Scanning recommendations
Andrew Miller provided me with a sample original ECG paper document (coloured graph area, black plotting), with the patient details scrubbed out. I scanned this in various formats, resolutions and colour depths to provide a proper basis for commentary and comparisons. The results are all held here and can be viewed in a folder listing and accessed directly: http://www.enigma.co.nz/files/ENIGMA/NPIGG/
A summary table of the following is available here: http://www.enigma.co.nz/files/ENIGMA/NPIGG/ScanningSummary.html
PLEASE NOTE: when you view files from this site, your viewer may choose to show you the contents at a ZOOM level other than '100%'. In order to compare files of different types directly, please ensure that your view is set to 'Actual Size', or 100%. Also, different file types may invoke different viewers...
Please ignore the page rotation issues with these scans. All viewers are capable of rotating the viewed image, so I did not focus any effort on getting this correct. My testing was aimed at quality issues rather than this. :)
All scans up to 300dpi appeared to take the same length of time to scan on my basic scanner. Scanning at 400dpi and 600dpi took noticeably longer. This will be a consideration for my specific scanner and may not apply in the same way to all scanners.
Results achieved at 300dpi were perfectly readable; anything higher than this for these kinds of documents was overkill. We should therefore limit our recommendations to this as the highest required resolution for legibility at this point (for documents). Radiology images will be entirely different.
As discussed within NPIGG, the clarity of the scan is not dependent on the file type. It is due to the scanning settings used.
HOWEVER, TIFF images (.TIF), as written by my scanner software, are uncompressed images. Unlike the .JPG files, they had no inbuilt method of compression applied to them. As such, the size of these files, for the same resolution and quality, tends to be larger than the equivalent PDF files. This is because PDF allows the content contained within it to be compressed.
An image stored within a PDF may still start out as an uncompressed bitmap, but the container (.PDF) allows the image content to be transported with compression applied. The PDF reader handles the decompression of the contained content, then renders the PDF document and its contents as it was formed (layout is contained too) for the recipient to view.
Other image file formats like .JPG may be worth a look; compared with .TIF images they are certainly capable of conveying more image quality and detail for the same file size. When we get to colour images we will see that the .JPG file size is directly comparable to the .PDF file size for the same quality image.
When scanning is done in Black and White, only pure black or pure white is allowed in the image. Every part of the image is forced to either full black or full white; there is no middle ground, so subtleties within the image are lost. This is particularly damaging when colour is used within the original to differentiate details, like the graphing background in the ECG.
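As a rough illustration of why detail is lost, the sketch below forces greyscale values to pure black or pure white. The pixel values and the threshold of 128 are assumptions chosen for illustration only; real scanner software picks its own threshold.

```python
def to_black_and_white(grey_pixels, threshold=128):
    """Map each 0-255 greyscale value to pure black (0) or pure white (255)."""
    return [255 if value >= threshold else 0 for value in grey_pixels]


# Hypothetical values along one scan line of an ECG:
# white paper, faint grid line, darker grid line, black trace.
scan_line = [255, 200, 140, 20]

print(to_black_and_white(scan_line))  # [255, 255, 255, 0]
```

Both grid lines end up indistinguishable from the paper, which is the effect described above for the ECG graphing background.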
The benefit of Black and White is that, because there are only two colours in the palette used within the image, the filesizes are small in comparison to all other formats for scanning. Black and white is perfectly suitable for printed letter scanning and storing, especially if a reasonable resolution is used, and especially where the original letter was black text printed onto white paper. It is perfectly reasonable for someone who regularly scans documents to be expected to be able to change scanning settings to create appropriately scanned documents.
Taking an approach where people scan letters in B/W will dramatically reduce the overall storage required. If it is assumed that, say, 25% or more of all scanned documents would be of this nature, then it is worth encouraging B/W scanning of letters.
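To put a rough number on this, the sketch below compares total storage with and without B/W scanning of letters. It assumes the per-page figures given in the guidelines later in this document (about 100kB for a 300dpi B/W PDF page and about 1.6MB for a 300dpi greyscale PDF page); the batch of 1000 pages is a made-up example.

```python
def total_storage_mb(pages, letter_fraction, letter_kb=100, other_kb=1600):
    """Estimate storage in MB when a fraction of pages are scanned in B/W.

    letter_kb and other_kb are approximate per-page sizes (assumptions
    taken from the sample scans, not guaranteed for other scanners).
    """
    letters = pages * letter_fraction
    others = pages - letters
    return (letters * letter_kb + others * other_kb) / 1000


all_greyscale = total_storage_mb(1000, 0.0)
quarter_bw = total_storage_mb(1000, 0.25)

print(all_greyscale)  # 1600.0
print(quarter_bw)     # 1225.0
```

Under these assumptions, scanning a quarter of the pages in B/W saves a little under a quarter of the total storage.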
When scanning images which require the detail captured in greyscale or colour, TIFF files do not offer an appropriate format for capturing this information, purely due to the lack of compression offered. This creates files which are too large to reasonably store and send through interfaces which have maximum message sizes in the region of 2 or 5 MB. When this level of detail is required (ECGs etc), it is necessary to switch to a compressed image format, either .JPG, or to contain it within a document format which handles the compression (.PDF).
eg: TIFF Greyscale 300 dpi = 8.5MB vs PDF Greyscale 300 dpi = 1.6MB - and there is no noticeable difference in the visible quality of the contents of the file. Colour TIFF 300 dpi is 23.6MB! - You can see how these non-compressed formats have been seen to be completely unusable at these higher resolutions.
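The uncompressed sizes can be sanity-checked with simple arithmetic: pixels across, times pixels down, times bytes per pixel. The sketch below assumes a full A4 page (8.27 x 11.69 inches), 1MB = 1,000,000 bytes, and no file header overhead, so it only approximates the measured figures; the actual scanned area on my scanner was probably slightly smaller than a full page.

```python
A4_WIDTH_IN = 8.27    # A4 width in inches (210mm)
A4_HEIGHT_IN = 11.69  # A4 height in inches (297mm)


def uncompressed_mb(dpi, bits_per_pixel):
    """Estimate the raw size in MB of a full A4 page scanned at the given dpi."""
    width_px = round(A4_WIDTH_IN * dpi)
    height_px = round(A4_HEIGHT_IN * dpi)
    return width_px * height_px * bits_per_pixel / 8 / 1_000_000


print(f"{uncompressed_mb(300, 8):.1f}")   # greyscale, 8 bits/pixel: 8.7
print(f"{uncompressed_mb(300, 24):.1f}")  # colour, 24 bits/pixel: 26.1
print(f"{uncompressed_mb(300, 1):.1f}")   # black and white, 1 bit/pixel: 1.1
```

The estimates land in the same region as the measured TIFF sizes, which is consistent with those files holding the pixel data with no compression applied.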
Comparing .JPG vs .TIF file formats (these are native image formats, without any document structure around them [no PDF container]), the compression afforded within the JPG file brings the filesize of a colour scanned image at 150dpi down from 5.8MB in TIFF format to 558kB in JPG format. The JPG formatted 300dpi, full colour page is 1.99MB, under the 2MB limit which used to apply for HealthLink attachments.
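As a quick cross-check, the sketch below lists which of the file sizes quoted above would fit under a given message size limit. The dictionary simply restates the measured figures from my scans; the function name is made up for illustration.

```python
# File sizes reported above, in MB (558kB written as 0.558).
REPORTED_SIZES_MB = {
    "TIFF greyscale 300dpi": 8.5,
    "PDF greyscale 300dpi": 1.6,
    "TIFF colour 300dpi": 23.6,
    "TIFF colour 150dpi": 5.8,
    "JPG colour 150dpi": 0.558,
    "JPG colour 300dpi": 1.99,
}


def files_within(limit_mb, sizes=REPORTED_SIZES_MB):
    """Return the names of the scans that fit under the given size limit."""
    return sorted(name for name, size in sizes.items() if size <= limit_mb)


for limit in (2, 5):
    print(f"Under {limit}MB:", files_within(limit))

# Size reduction of JPG over TIFF for the 150dpi colour page.
print(round(5.8 / 0.558, 1))  # 10.4
```

None of the TIFF scans fit under either limit, while both JPG scans and the greyscale PDF fit under 2MB.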
Where colour backgrounds are used (ECG) to distinguish between the graph plotting area and the plotted lines, it is worth recommending that a colour scan mode be used, especially if there are scanning options which make the filesizes comparable to their lesser quality, non-compressed counterparts.
The PDF file format manages to achieve better compression, in general, than my particular scanner software yielded with JPG file formats. The JPG file format does, however, allow for various levels of compression: for a given scanned dpi it is possible to set the compression level within the file. It is possible to set the compression level so high that the image becomes extremely 'lossy' in quality. This means that it would be impossible to set a 'standard' of scanning if we were to recommend JPG file formats, and so we cannot and should not recommend this.
Looking at the file sizes rendered from the various settings, I think we're safe to issue the following MINIMUM scanning guidelines:
Minimum Scanning Guidelines
Documents where background colours would enhance its interpretation:
Format: PDF format.
Scan Quality: 150dpi (minimum)
Colour mode: Full colour
SAMPLE File: PDF 150dpi Colour
Should yield a filesize of roughly <550kB per A4 page.
Preferred: PDF Colour, 300dpi = ~2MB / A4 page.
For ECGs and other graph based content:
Do not use Black and White; use at minimum Greyscale, colour is preferred.
Never use TIFF; you cannot create a sufficiently detailed TIFF image within a reasonable file size.
Use at least 150dpi; 300dpi is strongly preferred.
Use colour scanning sparingly, it is often not required
Format: PDF format.
Scan Quality: 300dpi
Colour mode: Black / White (not greyscale)
SAMPLE File: PDF 300dpi BW
Should yield a filesize of roughly <100kB per A4 page.
** This is notably better than the greyscale option at 150dpi which gives a larger filesize =~ 250 to 500kB. **
General setting if users don't know how to use scanners well?
If there had to be a 'default' which could possibly be suitable for almost everything (apart from radiology?), the following might be reasonable for those users who don't know how to change settings to achieve better results:
Format: PDF format.
Scan Quality: 300dpi
Colour mode: Greyscale
SAMPLE File: (ECG) 300 dpi GreyScale
Should yield a filesize of roughly 1.6MB per A4 page.
Even the ECG was perfectly readable using this setting.
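For anyone wanting to build these guidelines into scanning software or a checklist, the settings above could be captured as a simple lookup. This is only a sketch: the profile names, the ScanProfile class and the profile_for function are all made up for illustration, and the sizes are the approximate per-page figures from my sample scans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanProfile:
    file_format: str
    dpi: int
    colour_mode: str
    approx_kb_per_a4_page: int  # rough figure from the sample scans


# Hypothetical profile names mapped to the minimum guidelines above.
PROFILES = {
    "colour_background": ScanProfile("PDF", 150, "colour", 550),
    "printed_letter": ScanProfile("PDF", 300, "black_and_white", 100),
    "default": ScanProfile("PDF", 300, "greyscale", 1600),
}


def profile_for(document_kind):
    """Return the profile for a document kind, falling back to the default."""
    return PROFILES.get(document_kind, PROFILES["default"])


print(profile_for("printed_letter").colour_mode)  # black_and_white
print(profile_for("unknown").colour_mode)         # greyscale
```

Falling back to the greyscale 300dpi profile mirrors the 'default' suggestion above for users who do not change their scanner settings.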