
…said the Analyst to the EDRMS

Well, of course it does.

When considering a document scanning project, there are a plethora of technical settings which need to be examined.  I am constantly asked what impact scan resolution has on the file size and quality. For the purposes of this article I will focus exclusively on these two aspects and address other considerations in subsequent articles.  (The next article reviews what impact compression has on the image quality).

When scanning files, there is often a trade-off between scanning a document at the highest possible resolution for the best visual quality and keeping the file size manageable.

File Size

Firstly, let’s look at how file size impacts usability. To put it in unscientific terms, the faster an image appears on the screen, the better, and smaller files open faster than larger ones. Conversely, frustration levels soar if a user has to wait tens of seconds before an image appears (and even longer for larger files). This problem may be exacerbated if a user is accessing a hosted solution rather than files on their own computer.

Image Quality

On the other side of this trade-off is image quality. The rule of thumb is: the higher the resolution, the better the quality. (In reality there are a number of cases where this is not so, but this too will be addressed in a subsequent article.)

The State Archives recommends that archive scanning be performed at 600 PPI (pixels per inch). Note: this is a recommendation and not a standard. The guidelines go on to suggest the resolution can be adjusted to ensure the image is “fit for purpose”, that is, set to the appropriate level for the particular document type and circumstances.

The best way to illustrate the effect resolution has on file size is via an example. I have taken a typical single A4 page of content and scanned it at varying resolutions (both Black & White and Colour). The resulting uncompressed file sizes are listed in the table below:

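| Resolution (PPI) | Uncompressed Black and White Size (KB) | Uncompressed Colour Size (KB) |
|------------------|----------------------------------------|-------------------------------|
| 200              | 474                                    | 11 312                        |
| 300              | 1 066                                  | 25 380                        |
| 400              | 1 892                                  | 45 162                        |
| 600              | 4 257                                  | 101 597                       |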

This shows that uncompressed file size grows with the square of the resolution: it slightly more than doubles from 200 to 300 PPI, and is roughly four times larger at 600 PPI than at 300 PPI. A typical multi-page PDF document consisting of 30 – 40 double sided pages therefore varies quite significantly in size (even when compression is factored in) between a low and a high resolution scan. Not only will accessing a large file cause frustration, you may well have the IT department up in arms over significant storage space requirements and network traffic bottlenecks.
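For readers who want to estimate sizes for their own projects, the figures in the table can be approximated with a little arithmetic. The sketch below is not how the scanner software calculates anything. It simply assumes an A4 page of 210 mm x 297 mm, 1 bit per pixel for Black & White and 24 bits per pixel for Colour, and ignores file headers and row padding. The function name is made up for illustration.

```python
# Rough estimate of the uncompressed size of a single scanned A4 page.
# Assumptions (not stated in the article): A4 is 210 mm x 297 mm,
# Black & White uses 1 bit per pixel, Colour uses 24 bits per pixel,
# and file headers and row padding are ignored.

MM_PER_INCH = 25.4
A4_MM = (210, 297)


def uncompressed_kb(ppi, bits_per_pixel, page_mm=A4_MM):
    """Return the approximate uncompressed size in KB (1 KB = 1024 bytes)."""
    width_px = round(page_mm[0] / MM_PER_INCH * ppi)
    height_px = round(page_mm[1] / MM_PER_INCH * ppi)
    total_bits = width_px * height_px * bits_per_pixel
    return total_bits / 8 / 1024


if __name__ == "__main__":
    for ppi in (200, 300, 400, 600):
        bw = uncompressed_kb(ppi, 1)
        colour = uncompressed_kb(ppi, 24)
        print(f"{ppi} PPI: B&W ~{bw:,.0f} KB, Colour ~{colour:,.0f} KB")
```

Under these assumptions the estimates land close to the measured figures in the table. Small differences are to be expected, since the actual pixel dimensions a scanner produces may vary slightly.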

Output Quality

The next aspect of the document scanning process to review is the quality of the output. I have taken a screenshot of the same snippet of the document at 200, 300 and 600 PPI respectively. (Don’t be concerned about the content; the snippets are merely there to demonstrate the relative quality of the result.) I have specifically used a document containing a pattern, as the points where the patterns intersect are where changes in quality are best observed.

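| Resolution (PPI) | B&W Image                        | Colour Image                        |
|------------------|----------------------------------|-------------------------------------|
| 200              | [Image: 200PPI B&W Sample Image] | [Image: 200PPI Colour Sample Image] |
| 300              | [Image: 300PPI B&W Sample Image] | [Image: 300PPI Colour Sample Image] |
| 600              | [Image: 600PPI B&W Sample Image] | [Image: 600PPI Colour Sample Image] |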

From the samples, we can see that a high resolution scan creates a crisper, clearer image. This difference is most noticeable between 200 PPI and the others, but is less obvious to the naked eye between 300 PPI and 600 PPI. On face value, it seems that the slightly higher quality we get with the 600 PPI image may not be sufficient to justify creating files which are roughly four times the size of the 300 PPI image. For colour images the impact is amplified, as the files are far larger to begin with.

Fit For Purpose

Now that we know the impact resolution has on file size and image quality, the next action item is to define how “Fit for Purpose” applies to your document scanning project. [Refer to page 6 of the Digitisation Disposal Policy –  Queensland State Archives]

The best place to start answering this question is to examine the reason for initiating the document scanning project and the types of records involved. This has the greatest impact on resolution settings. If, for example, you are bulk scanning financial documents (invoices etc.) which only have a retention period of 7 years, with no real requirement beyond being a legible representation of the original, then scanning at 200 – 300 PPI Black & White may well be sufficient. This produces a usable image where the content can be read with confidence. If the same images are intended to be OCR’d for data capture, then you would not go below 300 PPI, as doing so would negatively impact the result.

If, however, you are imaging a legal document, say, where the expectation is for the image to be as close a representation of the original as possible, with the smallest detail clearly visible, it follows that higher resolution colour may be needed. Even then, 600 PPI would probably be overkill. An alternative approach would be to step up to 400 PPI if there are compliance concerns regarding 300 PPI.
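The rules of thumb in the last two paragraphs can be written down as a small lookup, which some teams find handy as a starting checklist. This is only a sketch: the function name, the purpose labels and the fallback setting are hypothetical, and the right answer for a real project still depends on its own "fit for purpose" assessment.

```python
# Illustrative only: encodes this article's rules of thumb as a starting point.
# The function name, purpose labels and fallback value are hypothetical.

def suggest_scan_settings(purpose, needs_ocr=False, compliance_concern=False):
    """Return a (ppi, colour_mode) starting point for a scanning project."""
    if purpose == "bulk_financial":
        # Legibility is the only requirement: 200 to 300 PPI Black & White.
        # OCR results suffer below 300 PPI, so use the top of the range.
        ppi = 300 if needs_ocr else 200
        return ppi, "bw"
    if purpose == "legal":
        # Fine detail matters, so scan in colour. Step up to 400 PPI only
        # if there are compliance concerns about 300 PPI.
        ppi = 400 if compliance_concern else 300
        return ppi, "colour"
    # Assumed fallback for anything else: 300 PPI Black & White.
    return 300, "bw"


if __name__ == "__main__":
    print(suggest_scan_settings("bulk_financial", needs_ocr=True))
    print(suggest_scan_settings("legal", compliance_concern=True))
```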

There is a concept called the “Point of Diminishing Returns”. There comes a point where scanning at a higher resolution makes only a marginal difference to quality. For the example used above, if the document had been scanned at 1200 PPI colour, the increase in quality would be minimal but the price paid in file size would be dramatic. Note: A situation where higher resolutions do make a difference is when scaling up the image, such as a photographic negative that is to be enlarged. For this article, I am focussing on 1:1 scale document scanning.
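To put a number on "dramatic": for the same page and colour depth, uncompressed size scales with the number of pixels, which grows with the square of the resolution. The short sketch below works out the multiplier relative to a 300 PPI scan. The function name is made up for illustration, and compression would change the actual figures.

```python
# Relative uncompressed size of a scan compared with a baseline resolution.
# Assumes the same page and colour depth, so size scales with pixel count,
# which grows with the square of the resolution.

def size_multiplier(ppi, baseline_ppi=300):
    """Return how many times larger a scan is than one at baseline_ppi."""
    return (ppi / baseline_ppi) ** 2


if __name__ == "__main__":
    for ppi in (300, 400, 600, 1200):
        print(f"{ppi} PPI is about {size_multiplier(ppi):.1f}x the size of a 300 PPI scan")
```

On this basis a 1200 PPI scan is 16 times the size of the 300 PPI scan, for a quality gain that is hard to see at 1:1 scale.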

Given that each project requires specific considerations around file size and resolution, it is difficult to make hard and fast recommendations to cover all scenarios. Rather, this article highlights the factors which need to be considered when document scanning. More often than not, we get asked to perform bulk scanning on documents at 300 PPI (either B&W or Auto-Colour) as this provides a good balance between the size and the quality of output.

Odds are, you have witnessed, or know somebody who works in, a company that has implemented an overarching software solution which was supposed to be a cure-all for electronic records. But when it came down to specific business processes, the end result was less than desirable. Typically in these enterprise solutions, the automation of key end-user activities is lagging.

Promises Abound

Nearly every vendor – software & hardware – claims to have the answer to your prayers.  After all, business process automation is a hot topic right now and just about every software company and integrator is jumping on the bandwagon.

Despite the hoopla, what I find most confusing is that very few ‘solutions’ actually solve the problem. Information ends up being manually checked or pushed down the line to become somebody else’s problem. In other cases, it may be locked away in some sort of exclusive virtual vault, inaccessible to decision makers or the front line when it matters the most. More often than not, additional IT resources are employed to ‘manage’ the software, negating any resource benefits the organisation anticipated once ‘automation’ was achieved.

In the end, the organisation is locked into an expensive, incomplete investment. Front-line employees create workarounds and, even worse, your customers or suppliers are confused or frustrated by the need to provide the same information multiple times.

So, I ask myself where is the time, effort, and money going once the dust has settled on the initial implementation?

Some companies may act as if the problem does not still persist, while others see the light and decide automation is more than a catchphrase; it is a strategic imperative.  In doing so, they assign leaders such as you and they bring on experts to help lead the way.

It is common for companies to experience pain in attempting to make digital disruption through enterprise software solutions work for them. The good news is that there is a way to lessen the impact by taking several steps to address legacy issues that can cause problems.

To achieve automation for business unit functions that continue to lag, it is best to take a targeted approach that is not limited by the constraints of the enterprise software. Instead, huge gains can be made by scoping a solution that complements and integrates with core systems to take your performance to the next level.

1. Identify the processes that have the best chance to produce some quick wins. This is important for everyone involved to break the negativity that can sometimes surround a changing environment.  Bring hope to end users and senior management alike that real benefits can be enjoyed across all layers.  It can be important to understand that the status quo is very rarely the only possibility.  Your recent changes have improved certain aspects of the business, but in so doing, may have made others more painful. The best of both is achievable.

2. Get input from representatives of each of the departments/locations/roles that interact with the documents or data during the process, from the point the information first enters the organisation through to its final “completed” state in the enterprise solution. Understand all the challenges that face the people involved at each of these touch points.

3. Form an in-depth understanding of the causes and impacts that incomplete or inaccurate data has on the process. What are the waste or risk implications? Are people avoiding what should be done? If so, why?

4. Identify how, where and when the data needs to be integrated into core systems. Is there a single source of truth? Are there multiple software systems that need to be synchronised so that during every decision-making step along the way people can trust their data?

5. Where does it all go wrong? What are the causes of breakdowns in the process? Things break! The most perfect process comes unstuck when a supplier/customer/contractor/staff member misses a key piece of information, makes a mistake or neglects what was meant to occur.

6. Weigh up the pros and cons of customising the ERP versus a third party integrated software solution. Consider the risk of budget blowouts and the impact on future upgrades.

7. From this grounding, you are now armed with the information to take positive action. What this plan may encompass will depend on the specific situation.  It may be any combination of people, process or technology.  It may require custom scripting or data modelling to allow for integration in areas that cannot be integrated.

8. Learn to look beyond the clever marketing of your vendors. The trick is being armed with the full knowledge of what the solution needs to address and having a much clearer picture of the outcomes software automation must achieve. In this way, you gain control of the outcome and are not left at the mercy of professional sales people bedazzling you with the exciting story of software robots or other fluff.

As this space can be highly detailed with many variances, I am only just scratching the surface, but rest assured there is a lot more in my head than I’ll ever get down on paper.  I’m always available for a chat or a coffee to help in a more useful way.

Our only Competitive Advantage is to learn faster than our competitors

Yes, for those of you who recognise the quote, I borrowed this title from Arie de Geus (Strategist – Shell Oil) and while it may be over 20 years old, the sentiment applies more now than ever. With a plethora of agile tools available to business, the lead time for any competitor to match or better New Product Design has reduced from what was traditionally years to mere weeks. So in reality, when it comes to gaining competitive advantage through innovation, all that sets us apart from any competitor is our ability to learn faster and, in turn, to implement products, processes or solutions which reflect what we have learned.

This being the case, it stands to reason that a starting point for innovation is to first have comprehensive knowledge of your business’ current position before departing on future initiatives. What can be learned from existing methodologies, processes, experiences and trends, and how can these be relied upon as the foundation for change? The decisions made from what you have learned will have significant impact on success.

The cornerstone of Quality Decision Making within an organisation is the quality of data upon which decisions are built. Because data integrity plays such a critical role in the organisation, it follows that Data Governance should hold its place as “first among equals” in the spectrum of business tools available to strategists and decision makers.

Because Data Governance matters, best results are achieved through knowledge based on the broadest possible spectrum of data – not just the bottom line produced by the ERP or other core system. To take advantage of all the knowledge within the business, it is necessary to bring together data sets from every source: paper records, digital records and business activities. Data Governance stems from the right tools and a defined set of procedures, specifically focussed on achieving these outcomes:
– Increased confidence in decision making
– Decreased Risk
– Better planning and strategising
– Faster identification of improvement areas
– Better staff effectiveness

In other words, the ability of an organisation to learn all there is, from end to end, elevates the success of future innovation – whatever form it may take.