.NET Define line endings of created files

Added by Christian Merkel almost 6 years ago

First of all my question relates to the .NET version of Saxon.

I'm currently setting up the transformation kind of like this:

var processor = new Processor();
// (...)

FileInfo outputFile = new FileInfo("<path>");
StreamWriter outStream = outputFile.CreateText();
outStream.NewLine = "\r\n";

Serializer serializer = processor.NewSerializer(outStream);

// (...)
transformer.Run(serializer);
// (...)

Unfortunately, the output file is sometimes written with CR LF and sometimes with LF only, so the NewLine property seems to be ignored.

How is it possible to force the line endings to be of a specific format?


Replies (10)


RE: .NET Define line endings of created files - Added by Michael Kay almost 6 years ago

Have you tried setting the saxon:newline serialization property on the Serializer? I see there isn't a specific constant defined for it, but it should be possible to pass it as Serializer.SetOutputProperty(new QName("http://saxon.sf.net/", "newline"), "\r\n")

RE: .NET Define line endings of created files - Added by Christian Merkel almost 6 years ago

Setting this property doesn't throw any exception, but it also doesn't change anything in the output. It's still sometimes in UNIX format instead of Windows.

RE: .NET Define line endings of created files - Added by Christian Merkel almost 6 years ago

I took a look at the Serializer code from metadata and found these static QNames defined:

public static readonly QName METHOD;
public static readonly QName SAXON_REQUIRE_WELL_FORMED;
public static readonly QName NEXT_IN_CHAIN;
public static readonly QName SAXON_SUPPRESS_INDENTATION;
public static readonly QName SAXON_DOUBLE_SPACE;
public static readonly QName SAXON_CHARACTER_REPRESENTATION;
public static readonly QName VERSION;
public static readonly QName USE_CHARACTER_MAPS;
public static readonly QName UNDECLARE_PREFIXES;
public static readonly QName SUPPRESS_INDENTATION;
public static readonly QName STANDALONE;
public static readonly QName SAXON_INDENT_SPACES;
public static readonly QName NORMALIZATION_FORM;
public static readonly QName MEDIA_TYPE;
public static readonly QName INDENT;
public static readonly QName INCLUDE_CONTENT_TYPE;
public static readonly QName ESCAPE_URI_ATTRIBUTES;
public static readonly QName ENCODING;
public static readonly QName OMIT_XML_DECLARATION;
public static readonly QName DOCTYPE_SYSTEM;
public static readonly QName DOCTYPE_PUBLIC;
public static readonly QName CDATA_SECTION_ELEMENTS;
public static readonly QName BYTE_ORDER_MARK;

I tried setting serializer.SetOutputProperty(Serializer.OMIT_XML_DECLARATION, "yes"); which works as intended. The SAXON_NEWLINE property exists in the docs but is missing from the defined QNames.

RE: .NET Define line endings of created files - Added by Michael Kay almost 6 years ago

The "sometimes" is puzzling, what are the conditions?

Please supply a repro showing exactly what you are doing so we can investigate.

RE: .NET Define line endings of created files - Added by Christian Merkel almost 6 years ago

I've attached an example project containing a snippet.

Forget about the "sometimes". The attached example always produces a file with UNIX line endings instead of Windows.

RE: .NET Define line endings of created files - Added by O'Neil Delpratt almost 6 years ago

Hi,

Thanks for reporting this issue. It is a bug, and I am working on a fix. See bug issue #3828.

The problem is the default properties of the serializer are being picked up instead of those set by the user.

RE: .NET Define line endings of created files - Added by O'Neil Delpratt almost 6 years ago

Hi,

We think this is not actually a bug. The line-ending setting should only be applied when the xsl:output method is set to 'text'. Also, Saxon-PE or Saxon-EE is required to use Saxon extension properties.

RE: .NET Define line endings of created files - Added by Michael Kay almost 6 years ago

Note also the definition of the TextWriter.NewLine property:

The line terminator string is written to the text stream whenever one of the WriteLine methods is called.

The Saxon serializer does not use WriteLine to produce output, therefore this property will have no effect.
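
To illustrate the point about NewLine, here is a minimal sketch using only the standard StringWriter class, with no Saxon involved. The class name NewLineDemo is made up for this example. It shows that a newline embedded in text passed to Write is left alone, while WriteLine appends the configured terminator.

```csharp
using System;
using System.IO;

public static class NewLineDemo
{
    // Returns what a StringWriter holds after one Write and one WriteLine call.
    public static string Run()
    {
        var writer = new StringWriter();
        writer.NewLine = "\r\n";
        writer.Write("a\nb");   // the embedded LF is written unchanged
        writer.WriteLine("c");  // only here is NewLine ("\r\n") appended
        return writer.ToString();
    }

    public static void Main()
    {
        // Escape the control characters so the result is visible on the console.
        Console.WriteLine(Run().Replace("\r", "\\r").Replace("\n", "\\n"));
        // prints: a\nbc\r\n
    }
}
```

Any component that emits its own "\n" characters through Write, as the serializer is described as doing above, therefore bypasses the NewLine setting entirely.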

RE: .NET Define line endings of created files - Added by Christian Merkel almost 6 years ago

So you do not provide any possibility to change the line endings even in Saxon-PE and Saxon-EE if the xsl:output method is set to XML?

This forces us either to use a custom TextWriter or to reformat an already created file.

Line endings are a really basic feature. Is there any plan to implement a property for specifying line endings? Or why don't you use the WriteLine methods?
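
For readers who land here looking for the custom TextWriter workaround mentioned above, the following is a rough sketch of one way it could look. CrLfTextWriter and CrLfDemo are hypothetical names and are not part of Saxon or .NET. The sketch relies on the base TextWriter class routing its string and char[] overloads through Write(char), and it assumes, without having verified it, that Saxon's .NET Serializer writes its output through those overridable methods.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: wraps another TextWriter and rewrites every
// line break (LF, CR, or CRLF) to CRLF before passing it on.
public sealed class CrLfTextWriter : TextWriter
{
    private readonly TextWriter inner;
    private bool lastWasCr; // true if the previous character was '\r'

    public CrLfTextWriter(TextWriter inner)
    {
        this.inner = inner;
    }

    public override Encoding Encoding => inner.Encoding;

    // The base class funnels Write(string) and Write(char[], int, int)
    // through this overload, so it is the only one overridden here.
    public override void Write(char value)
    {
        if (value == '\r')
        {
            inner.Write("\r\n");
            lastWasCr = true;
            return;
        }
        if (value == '\n')
        {
            // An LF directly after a CR was already emitted as part of CRLF.
            if (!lastWasCr) inner.Write("\r\n");
            lastWasCr = false;
            return;
        }
        lastWasCr = false;
        inner.Write(value);
    }

    public override void Flush() => inner.Flush();

    protected override void Dispose(bool disposing)
    {
        if (disposing) inner.Dispose();
        base.Dispose(disposing);
    }
}

public static class CrLfDemo
{
    // Pushes text through the wrapper and returns the normalized result.
    public static string Normalize(string text)
    {
        var target = new StringWriter();
        using (var writer = new CrLfTextWriter(target))
        {
            writer.Write(text);
            writer.Flush();
            return target.ToString();
        }
    }

    public static void Main()
    {
        string result = CrLfDemo.Normalize("a\nb\r\nc");
        Console.WriteLine(result.Replace("\r", "\\r").Replace("\n", "\\n"));
        // prints: a\r\nb\r\nc
    }
}
```

In the setup from the original post, the wrapper would presumably be passed in place of the StreamWriter, for example processor.NewSerializer(new CrLfTextWriter(outputFile.CreateText())). Note that this rewrites every line break in the stream, including those inside text content. An XML parser normalizes these on input anyway, but it may matter for the text output method.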

RE: .NET Define line endings of created files - Added by Michael Kay almost 6 years ago

I know that saying "there's no demand for it" is never a very satisfactory response for someone who is demanding it, but it might be worth asking why this hasn't proved to be a major problem for other people. Obviously any conformant XML parser will accept either NL or CRLF, so presumably it's human readers you are concerned about. But I thought that on Windows most software that displays files will these days accept NL as equivalent to CRLF.

The serialization spec says:

When outputting a newline character in the instance of the data model, the serializer is free to represent it using any character sequence that will be normalized to a newline character by an XML parser, unless a specific mapping for the newline character is provided in a character map (see 11 Character Maps).

which (a) suggests character maps as another workaround, and (b) says that we could legitimately use a different line ending without breaching the spec.

The reason we don't use WriteLine is that "lines" are simply not a logical unit in XML. Sometimes an XML file is all on one line.
