Finding and Using Digital Archives: Why don't archivists digitise everything?

This guide covers how to find digitised and digital archives material and how to critically examine what you find.

Why don't archivists digitise everything?

This has to be the question that archivists are asked most regularly! Digitisation is a time-consuming and costly process and so a decision to digitise a collection - or part of a collection - is never undertaken lightly. It is important for you as a researcher to understand how these decisions are made, so that you have an idea of what type of records might be available digitally - and what records might be missing.

Reasons to digitise

1. To preserve fragile materials

Archivists have to constantly balance the preservation needs of our collections with their access needs. There is no point keeping things that no-one can look at, but every time an archive item is handled, it is put at risk. Very fragile items are therefore sometimes digitised so that researchers can continue to access a digital copy without damaging the original.

2. To preserve materials expected to have high use 

Similarly, some documents in archives that aren't particularly fragile but are consulted very regularly - such as parish registers - are often copied in order to preserve them. Access is then provided via a surrogate copy - either digitally or printed out to maintain the experience of leafing through the volume.

3. To enable full text searching 

Sometimes a process called Optical Character Recognition is applied to digitised archives, to make them text searchable. This is most regularly used with published items like historic newspapers. This makes the digitised archives easier to use than the physical item and enables researchers to carry out analysis using techniques such as text mining.
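To give a sense of what "text searchable" means in practice, here is a minimal Python sketch of a word index built over OCR output. It is an illustration only, not how any particular archive or newspaper platform works: the page identifiers and page text are invented, and the function names `build_index` and `search` are hypothetical.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each lower-cased word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(page_id)
    return index

def search(index, query):
    """Return the sorted page ids that contain every word in the query."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

# Invented OCR text for three newspaper pages (assumption, for illustration).
pages = {
    "gazette-1887-p1": "Launch of the new steamship at the harbour",
    "gazette-1887-p2": "Harbour board meeting adjourned",
    "gazette-1888-p1": "Steamship company reports record profits",
}
index = build_index(pages)
print(search(index, "steamship harbour"))
```

A search only finds words that the OCR process recognised correctly, so a misread word on a faded or damaged page will simply not appear in the results.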

4. To enable re-use in new contexts 

Archives - especially photographs - may be digitised in order to make them available for publication. This may be done to order (for example if a researcher would like to use a particular image in a book) or across an entire collection. This sometimes opens up a revenue stream for the archive.

5. To bring together dispersed collections 

Although the ideal situation is for one institution to hold all the papers relating to an individual, family or organisation, this is rarely the case. Correspondence, for example, will usually be split between the archives of the sender and the recipient. Collections may also become dispersed through emigration, family break-up or business acquisition. Archives are therefore sometimes digitised to bring these dispersed collections together in one place, to create a research environment that you couldn't have in the physical world.

6. To enable digital access to material that may not be physically accessible

Archives form an important part of personal, family, national and institutional identity. It is therefore unsurprising that many people choose to keep them close to their community, even when that community is geographically remote. Archives are therefore sometimes digitised to enable access to researchers who may otherwise have to make a long and difficult journey.

Conversely, an archive may be kept somewhere accessible but have relevance to people who are not easily able to travel to it. The archive may therefore be digitised to provide access to that community.

7. To promote collections

With most research beginning online, archivists are aware that digitising their collections can help researchers to find them. However, digitising an entire collection is very expensive. Archivists may therefore digitise some highlights of the collection to promote it, with the expectation that researchers will come to the archive to do their actual research.

Reasons not to digitise

1. Cost/time

Digitisation is a very labour-intensive and expensive process. It involves either acquiring special equipment and training people to carry out the work, or paying an external firm to carry it out for you. Once the archive has received the digital images, they need to have metadata added to them in order for them to be usable by researchers, and the digital images need to be stored securely. Digitisation is usually carried out at the highest resolution available at that time, to future-proof it for all possible uses. This means very large images, and resulting high storage costs. To ensure the images do not degrade over time, digital preservation techniques also need to be applied to the images. This means that archivists are effectively now committing to preserve both the physical and digital artefact, and so doubling their workload!
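The link between resolution and storage can be shown with some rough arithmetic. The Python sketch below estimates the uncompressed size of a single scanned page. All the figures in it are assumptions chosen for illustration (an A4 page, 600 pixels per inch, 24-bit colour, a 10,000-page collection); actual digitisation standards and file formats vary between archives.

```python
def pixels(length_inches, pixels_per_inch):
    """Number of pixels along one edge of the scan."""
    return round(length_inches * pixels_per_inch)

def uncompressed_bytes(width_px, height_px, bits_per_pixel):
    """Size of the raw image data, before any compression or metadata."""
    return width_px * height_px * bits_per_pixel // 8

# Assumed values: A4 page (8.27 x 11.69 inches), 600 ppi, 24-bit colour.
width_px = pixels(8.27, 600)
height_px = pixels(11.69, 600)
page_bytes = uncompressed_bytes(width_px, height_px, 24)

# Assumed collection size of 10,000 pages.
collection_bytes = page_bytes * 10_000

print(f"One page: about {page_bytes / 1_000_000:.0f} MB")
print(f"10,000 pages: about {collection_bytes / 1_000_000_000_000:.2f} TB")
```

Under these assumptions a single page comes to roughly 100 MB, so even a modest collection runs to around a terabyte before backup copies are taken into account.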

2. Low anticipated use

Archivists do not keep material that they believe will never be looked at by anyone; however, not all material is equally well-used. The digitisation priority is therefore usually the material which has been the most popular in the past (see point 2 under 'Reasons to digitise'). Sometimes this can be a sensible decision but sometimes it can become a self-fulfilling prophecy. Use-driven digitisation prioritises the archive's existing user base and may lead to some communities being unaware that an archive holds material relevant to them.

3. The physical qualities of the item can’t be adequately captured through digitisation

Material prioritised for digitisation is usually flat and easily scannable. Archives may be reluctant to digitise items that pose problems for these processes, not least because it is likely to involve a greater cost. Examples of the types of object that might not be digitised are 3-D objects, volumes with multiple flaps and inserts, documents with sticky notes attached, and reflective items (including photographs with silvering).

4. Copyright

The copyright of items in an archive can often be difficult to determine. Published and unpublished documents, photographs and 'works of art' all have different copyright periods, and it is not always clear who the creator of an archive item is. For this reason, many archives focus their digitisation resources on older material, which is likely to be out of copyright.

5. Confidential/personal data

As with copyright, determining the risks around confidential and personal data can be very difficult with archives. Sometimes the nature of a document will alert you to its risks - for example, staffing or medical records are very likely to contain personal data - but other times it can come as a surprise. Business correspondence can sometimes include information about the writer's family, or a potential libel about a competitor! The only way that archivists would know this would be to read through all the files before they are digitised, which would be a huge task. As with copyright, it is therefore less risky to focus on older material.

6. Protection of reputation 

Although the public have a legal right to access some governmental archives, most organisations can choose whether to make their archives accessible and to whom. An organisation that is digitising part of its own archives is likely to focus on documents that make it look good, rather than on documents that may damage its reputation.

Further reading

If you're interested in understanding more about the decision-making behind digitisation, you may find some of the following useful: