ICC profiles and resolution in JP2: update on 2011 D-Lib paper

01 July 2013

It’s been more than two years since I wrote my D-Lib paper “JPEG 2000 for Long-term Preservation: JP2 as a Preservation Format”. From time to time people ask me about the status of the issues mentioned in that paper, so here’s a long overdue update.


EPUB for archival preservation: an update

23 May 2013

Introduction

Last year (2012) the KB released a report on the suitability of the EPUB format for archival preservation. A substantial number of EPUB-related developments have happened since then, and as a result some of the report’s findings and conclusions have become outdated. This applies in particular to the observations on EPUB 3, and the support of EPUB by characterisation tools. This blog post provides an update to those findings. It addresses the following topics in particular:

  • Use of EPUB in scholarly publishing
  • Adoption and use of EPUB 3
  • EPUB 3 reader support
  • Support of EPUB by characterisation tools

In the following sections I will briefly summarise the main developments in each of these areas, after which I will wrap things up in a concluding section.


Adventures in Debian packaging

23 April 2013

About a year ago, work started on packaging SCAPE tools. Jpylyzer was the first SCAPE tool that was turned into a Debian package. Some time later, the OPF set up a couple of machine images at Amazon Web Services, which can be used to create packages repeatedly using a virtual machine. Even though I’ve used the Amazon service a couple of times myself, I really know next to nothing about Debian packages, and it’s safe to say that the underlying build process has been more or less a complete mystery to me.

To get a better understanding of the process for building Debian packages, I had a try at packaging jpylyzer on my local machine (which runs Linux Mint 14). Some time ago Dave Tarrant and Rui Castro wrote a nice step-by-step guide on building Debian packages on the OPF Wiki, so I tried to follow the instructions there. While working on this, I made some notes, mainly to remind myself of what I was doing. Then I realised that some of this might be useful to others as well, so I decided to turn it into a blog post.


What do we mean by "embedded" files in PDF?

09 January 2013

The most important new feature of the recently released PDF/A-3 standard is that, unlike PDF/A-2 and PDF/A-1, it allows you to embed any file you like. Whether this is a good thing or not is the subject of some heated on-line discussions. But what do we actually mean by embedded files? As it turns out, the answer to this question isn’t as straightforward as you might think. One of the reasons for this is that in colloquial use we often talk about “embedded files” to describe the inclusion of any “non-text” element in a PDF (e.g. an image, a video or a file attachment). On the other hand, the term “embedded files” in the PDF standards (including PDF/A) refers to something much more specific, which is closely tied to PDF’s internal structure.


Identification of PDF preservation risks with Apache Preflight: a first impression

19 December 2012

The PDF format contains various features that may make it difficult to access content that is stored in this format in the long term. Examples include (but are not limited to):

  • Encryption features, which may either restrict some functionality (copying, printing) or make files inaccessible altogether
  • Multimedia features (embedded multimedia objects may be subject to format obsolescence)
  • Reliance on external features (e.g. non-embedded fonts, or references to external documents)
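As a rough illustration of what looking for such features can involve, here is a minimal Python sketch that scans the raw bytes of a PDF for a few name tokens loosely associated with the risks listed above. It is not how Apache Preflight works. The marker table and the function name `scan_pdf_bytes` are assumptions made up for this example. A plain byte scan misses anything stored inside compressed object streams, and it cannot tell whether a font is embedded, so treat a hit as a hint at most.

```python
import re

# Hypothetical marker table (an assumption for this sketch): PDF name
# tokens loosely associated with each category of preservation risk.
RISK_MARKERS = {
    "encryption": [b"/Encrypt"],
    "multimedia": [b"/RichMedia", b"/Movie", b"/Sound", b"/Screen"],
    "external_references": [b"/GoToR", b"/Launch", b"/URI"],
}


def scan_pdf_bytes(data: bytes) -> dict:
    """Return the risk categories whose marker names occur in the raw bytes.

    This is a naive heuristic: it does not parse the PDF, so names hidden
    in compressed object streams are not found.
    """
    found = {}
    for risk, markers in RISK_MARKERS.items():
        hits = []
        for marker in markers:
            # Require that the name is not followed by another alphanumeric
            # character, so "/Encrypt" does not match "/EncryptMetadata".
            pattern = re.escape(marker) + rb"(?![A-Za-z0-9])"
            if re.search(pattern, data):
                hits.append(marker.decode("ascii"))
        if hits:
            found[risk] = hits
    return found
```

To try it on a file, read the file in binary mode and pass the bytes to `scan_pdf_bytes`. A dedicated validator remains the better tool for any real assessment.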

