xref offset wrong in cross-reference stream #68

@g-tc

The following concerns PDF v1.5 or later using cross-reference streams. I have not looked at how the issue affects xref tables in PDF v1.4 and earlier.

The cross-reference stream data includes an entry for the cross-reference stream object itself, but the offset SynPDF records for it is not correct. The offset is redundant information: any reader is only reading the stream because it found it from the offset in the trailer, so almost no software cares about it. But in the last 12 months or so, PDF-XChange from Tracker Software has been reporting: "Error loading object: misaligned object". The load proceeds and PDF-XChange reports having automatically corrected the document, so it's not a big deal, but it worries users (and it annoys developers who want to know their PDFs are valid).

A typical PDF from SynPDF contains xref data something like this heavily cut-down example (object number, entry type, offset or object stream number, generation or index):

0: 0 0000 ff
1: 2 0007 00
2: 2 0007 01
3: 2 0007 02
4: 2 0007 03
5: 1 000f 00
6: 1 0042 00    << xref object
7: 1 0042 00    << metadata object
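
A listing like the one above can be checked mechanically. The following is a minimal Python sketch, not part of SynPDF, that takes the rows exactly as shown and reports type 1 (uncompressed) entries that share a byte offset. The `ENTRIES` list and the `duplicate_offsets` helper are names made up for this illustration.

```python
from collections import defaultdict

# Rows copied from the cut-down listing: (object number, type, field 2, field 3).
# Type 1 = uncompressed object, field 2 is its byte offset.
# Type 2 = object stored in an object stream, field 2 is that stream's object number.
ENTRIES = [
    (0, 0, 0x0000, 0xFF),
    (1, 2, 0x0007, 0x00),
    (2, 2, 0x0007, 0x01),
    (3, 2, 0x0007, 0x02),
    (4, 2, 0x0007, 0x03),
    (5, 1, 0x000F, 0x00),
    (6, 1, 0x0042, 0x00),  # xref object
    (7, 1, 0x0042, 0x00),  # metadata object
]


def duplicate_offsets(entries):
    """Return {offset: [object numbers]} for type 1 entries sharing an offset."""
    by_offset = defaultdict(list)
    for obj_num, entry_type, field2, _field3 in entries:
        if entry_type == 1:
            by_offset[field2].append(obj_num)
    return {off: nums for off, nums in by_offset.items() if len(nums) > 1}


for offset, objects in duplicate_offsets(ENTRIES).items():
    print(f"offset {offset:04x} is claimed by objects {objects}")
```

Two distinct uncompressed objects cannot start at the same byte, so any hit from this check points at a bad entry.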

Notice the last two objects have the same offset of 0042. That is the result of this code:

procedure TPdfDocument.SaveToStreamDirectEnd;
[...]
    for i := 1 to FXref.ItemCount-1 do
      with FXref.Items[i] do
      if ByteOffset<=0 then begin
        fByteOffset := fSaveToStreamWriter.Position;
        if Value<>FTrailer.FCrossReference then
          Value.WriteValueTo(fSaveToStreamWriter);
      end;
[...]

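That loop means that the xref entry for FCrossReference ends up with the same fByteOffset as the next object in the collection: the writer position is recorded for it, but nothing is written, so the position has not moved when the next object is reached. We can't give it the correct offset there because we don't know the correct offset yet.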
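
To make the effect of the skipped write concrete, here is a small Python model of that loop. It is a sketch only: the object names and sizes are invented for illustration, and it ignores the `ByteOffset<=0` filter in the real code.

```python
def assign_offsets(objects, xref_name, start=0):
    """Model of the save loop: record the position, then write unless it is the xref stream.

    objects is a list of (name, size_in_bytes) in collection order.
    """
    position = start
    offsets = {}
    for name, size in objects:
        offsets[name] = position   # mirrors fByteOffset := fSaveToStreamWriter.Position
        if name != xref_name:      # the cross-reference stream is not written in this loop
            position += size       # only a real write advances the position
    return offsets


# Hypothetical sizes, chosen so the offsets match the listing (0x0f, then 0x42).
offsets = assign_offsets(
    [("page", 0x33), ("xref", 0x20), ("metadata", 0x50)],
    xref_name="xref",
    start=0x0F,
)
for name, offset in offsets.items():
    print(f"{name}: {offset:04x}")
```

In this model the "xref" and "metadata" entries both come out at 0042, which is the pattern seen in the real file.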

The simplest fix I could see was in the following function, inserting the three marked lines so that the entry for the cross-reference stream uses FXrefAddress instead of the stale ByteOffset:

procedure TPdfTrailer.WriteTo(var W: TPdfWrite);
[...]
    for i := 1 to FXRef.ItemCount-1 do
    with FXRef.Items[i] do begin
      if ObjectStreamIndex>=0 then begin
        assert(FObjectStream<>nil);
        WR.AddIntegerBin(ord(xrefInUseCompressed),TYPEWIDTH);
        WR.AddIntegerBin(FObjectStream.ObjectNumber,offsetWidth);
        WR.AddIntegerBin(ObjectStreamIndex,genWidth);
      end else begin
        WR.AddIntegerBin(ord(xrefInUse),TYPEWIDTH);
        if Value = FCrossReference then              //<<<< INSERT special handling
          WR.AddIntegerBin(FXrefAddress,offsetWidth) //<<<< for Cross Reference entry
        else                                         //<<<< needs actual offset
          WR.AddIntegerBin(ByteOffset,offsetWidth);
        WR.AddIntegerBin(GenerationNumber,genWidth);
      end;
    end;
    FCrossReference.WriteValueTo(W);
  end;
  W.Add(CRLF + 'startxref' + CRLF).Add(FXrefAddress).
    Add(CRLF + '%%EOF' + CRLF);
end;

But you may have a better option.
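
For anyone wanting to see the intended result at the byte level, the sketch below packs xref stream rows in Python the way the proposed fix would, substituting the stream's own address for its entry. It assumes field widths of 1, 2 and 1 bytes to match the cut-down listing, and the xref address 0x0092 is a made-up value. `encode_xref_rows` is a hypothetical helper, not a SynPDF function.

```python
def encode_xref_rows(entries, xref_obj, xref_address, widths=(1, 2, 1)):
    """Pack (type, field 2, field 3) rows as big-endian fixed-width fields.

    The row for object number xref_obj gets xref_address as its offset,
    mirroring the special case in the proposed fix.
    """
    out = bytearray()
    for obj_num, (entry_type, field2, field3) in enumerate(entries):
        if entry_type == 1 and obj_num == xref_obj:
            field2 = xref_address  # real offset of the cross-reference stream
        for value, width in zip((entry_type, field2, field3), widths):
            out += value.to_bytes(width, "big")
    return bytes(out)


# Rows indexed by object number, taken from the cut-down listing in this issue.
rows = [
    (0, 0x0000, 0xFF),
    (2, 0x0007, 0x00),
    (2, 0x0007, 0x01),
    (2, 0x0007, 0x02),
    (2, 0x0007, 0x03),
    (1, 0x000F, 0x00),
    (1, 0x0042, 0x00),  # xref object, stale offset
    (1, 0x0042, 0x00),  # metadata object
]
data = encode_xref_rows(rows, xref_obj=6, xref_address=0x0092)
print(data[24:28].hex(), data[28:32].hex())
```

After the substitution, the row for object 6 carries the same value that follows `startxref`, and the metadata object keeps 0042 to itself.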
