Hello veraPDF users/devs,
I have a question about XMP packet validation in PDF documents: how
does veraPDF actually do it? Can you point me to the places in the
code where this is performed?
Let me elaborate on the question and describe a viable solution
suggested by the ISO standards committee. I have access to both ISO
16684-1:2019 [1] and ISO 16684-2:2014 [2]. The first standard
describes the XMP data model and properties in general terms and
explains the many alternative notations it accepts (which make the
packets difficult to validate). The second suggests a method for
normalizing the packets into a unique representation of the
information they store (this method can mostly be implemented without
knowing anything about the actual XMP data model or properties). It
then supplies some sample RELAX NG schemas to
validate some unspecified demo XMP packets. These schemas clearly do
not describe the full schema needed to validate the XMP packets of PDF
documents, since all PDF-specific properties are missing, along with
some generic ones. If the full schema for validating XMP packets in
PDF documents were publicly available, validating them would be quite
simple, as the normalization algorithm is not very difficult to
implement. I asked for it in [3], but I doubt Adobe will release it. It
may also not exist at all, since the ISO 16684-2:2014 recommendations
may have been developed independently of their use in any actual Adobe
product.
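To illustrate the kind of rewriting that normalization involves, here
is a minimal Python sketch. It is not the ISO 16684-2 algorithm; it
only collapses one of the alternative notations: simple properties
written as attributes of rdf:Description are rewritten as child
elements. The function name expand_property_attributes and the sample
packet are made up for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"
XMP_NS = "http://ns.adobe.com/xap/1.0/"
RDF_DESCRIPTION = f"{{{RDF_NS}}}Description"
RDF_ABOUT = f"{{{RDF_NS}}}about"


def expand_property_attributes(rdf_root):
    """Rewrite property attributes on rdf:Description as child elements.

    Hypothetical helper: it covers a single notation variant only and
    is not a full implementation of the normalization in ISO 16684-2.
    """
    # Collect the targets first so the tree is not modified while iterating.
    for desc in list(rdf_root.iter(RDF_DESCRIPTION)):
        for name, value in list(desc.attrib.items()):
            # Leave rdf:* and xml:* attributes (and unqualified ones) alone.
            if not name.startswith("{"):
                continue
            if name.startswith(f"{{{RDF_NS}}}") or name.startswith(f"{{{XML_NS}}}"):
                continue
            del desc.attrib[name]
            ET.SubElement(desc, name).text = value
    return rdf_root


# Made-up sample: xmp:CreatorTool given in the attribute shorthand.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:xmp="http://ns.adobe.com/xap/1.0/">
  <rdf:Description rdf:about="" xmp:CreatorTool="Example Tool"/>
</rdf:RDF>"""

root = expand_property_attributes(ET.fromstring(SAMPLE))
description = root.find(RDF_DESCRIPTION)
creator_tool = description.find(f"{{{XMP_NS}}}CreatorTool")
```

After this step the property is an element regardless of how the
packet originally wrote it, so a schema would only need to describe
the element form.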
Thank you in advance for any insight.
Regards,
Francesco
[1] https://www.iso.org/standard/75163.html
[2] https://www.iso.org/standard/57422.html
[3] https://github.com/adobe/xmp-docs/issues/20