Question on Validation Message of missing data element

Added by Radar Lei over 5 years ago

Hi Experts,

Could you please confirm whether my understanding of the validation message for a missing data element is correct?

  • The validation message finds the first missing 'Mandatory' field in the XML file being validated, and it also shows all the 'dependent' or 'optional' data elements that lie between the last filled field and this missing 'Mandatory' field.

Here is my example result: In content of element <C_C080>: The content is incomplete. The following elements would be valid here, all in no namespace: D_3036_5, D_3036_4.

I have an XSD definition as follows:

<xsd:element maxOccurs="1" minOccurs="1" name="C_C080">
                                <xsd:complexType>
                                  <xsd:sequence>
                                    <xsd:element maxOccurs="1" minOccurs="1" name="D_3036">
                                      <xsd:simpleType>
                                        <xsd:restriction base="xsd:string">
                                          <xsd:maxLength value="70"/>
                                          <xsd:minLength value="1"/>
                                        </xsd:restriction>
                                      </xsd:simpleType>
                                    </xsd:element>
                                    <xsd:element maxOccurs="1" minOccurs="0" name="D_3036_2">
                                      <xsd:simpleType>
                                        <xsd:restriction base="xsd:string">
                                          <xsd:maxLength value="70"/>
                                          <xsd:minLength value="1"/>
                                        </xsd:restriction>
                                      </xsd:simpleType>
                                    </xsd:element>
                                    <xsd:element maxOccurs="1" minOccurs="0" name="D_3036_3">
                                      <xsd:simpleType>
                                        <xsd:restriction base="xsd:string">
                                          <xsd:maxLength value="70"/>
                                          <xsd:minLength value="1"/>
                                        </xsd:restriction>
                                      </xsd:simpleType>
                                    </xsd:element>
                                    <xsd:element maxOccurs="1" minOccurs="0" name="D_3036_4">
                                      <xsd:simpleType>
                                        <xsd:restriction base="xsd:string">
                                          <xsd:maxLength value="70"/>
                                          <xsd:minLength value="1"/>
                                        </xsd:restriction>
                                      </xsd:simpleType>
                                    </xsd:element>
                                    <xsd:element maxOccurs="1" minOccurs="1" name="D_3036_5">
                                      <xsd:simpleType>
                                        <xsd:restriction base="xsd:string">
                                          <xsd:maxLength value="70"/>
                                          <xsd:minLength value="1"/>
                                        </xsd:restriction>
                                      </xsd:simpleType>
                                    </xsd:element>
                                    <xsd:element maxOccurs="1" minOccurs="1" name="D_3045">
                                      <xsd:simpleType>
                                        <xsd:restriction base="CL__BDEW3045">
                                          <xsd:maxLength value="3"/>
                                          <xsd:minLength value="1"/>
                                        </xsd:restriction>
                                      </xsd:simpleType>
                                    </xsd:element>
                                  </xsd:sequence>
                                </xsd:complexType>
</xsd:element>

XML file for the validation (the mandatory fields D_3036_5 and D_3045 are missing):

<C_C080>
          <D_3036>D_30360</D_3036>
          <D_3036_2>D_3036_20</D_3036_2>
          <D_3036_3>D_3036_30</D_3036_3>
</C_C080>


Is the above validation message as expected? Or could it be changed to show only the missing mandatory data elements (all of them)? The additional optional fields shown in the result could cause confusion.

Many thanks!

Best regards, Radar


Replies (8)


RE: Question on Validation Message of missing data element - Added by Radar Lei over 5 years ago

Sorry, the XSD and XML file were not displayed properly in my post:

XSD:

<xsd:element maxOccurs="1" minOccurs="1" name="C_C080">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element maxOccurs="1" minOccurs="1" name="D_3036">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="70"/>
            <xsd:minLength value="1"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element maxOccurs="1" minOccurs="0" name="D_3036_2">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="70"/>
            <xsd:minLength value="1"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element maxOccurs="1" minOccurs="0" name="D_3036_3">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="70"/>
            <xsd:minLength value="1"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element maxOccurs="1" minOccurs="0" name="D_3036_4">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="70"/>
            <xsd:minLength value="1"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element maxOccurs="1" minOccurs="1" name="D_3036_5">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="70"/>
            <xsd:minLength value="1"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element maxOccurs="1" minOccurs="1" name="D_3045">
        <xsd:simpleType>
          <xsd:restriction base="CL__BDEW3045">
            <xsd:maxLength value="3"/>
            <xsd:minLength value="1"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>

XML:

<C_C080>
  <D_3036>D_30360</D_3036>
  <D_3036_2>D_3036_20</D_3036_2>
  <D_3036_3>D_3036_30</D_3036_3>
</C_C080>

RE: Question on Validation Message of missing data element - Added by Radar Lei over 5 years ago

Hi Michael,

This symptom occurs in Saxon-EE 9.9.1.3 as well.

The issue you mentioned in https://saxonica.plan.io/issues/4214 works in the new version.

But my case is a little different. Some required fields are missing after the filled fields, and the message then shows every element that could come next, up to and including the first mandatory data element. My expectation is that, for the above example, it would show only the following:

In content of element <C_C080>: The content is incomplete. The following elements would be valid here, all in no namespace: D_3036_5, D_3045.

That is because only these two elements are defined as mandatory. What do you think?

Thanks and best regards, Radar

RE: Question on Validation Message of missing data element - Added by Michael Kay over 5 years ago

The error message is correct, but we could perhaps be a bit smarter in this case in finding and explaining the cause.

Internally, Saxon keeps a finite state machine, and each element that actually appears in the instance being validated causes transition to a new state. When the sequence ends, but the state is not an acceptable final state, we report that the sequence is incomplete, and we report what elements are permitted to come next. We don't attempt to search the finite state machine for a sequence of elements that would be needed to reach a valid final state. We could, but we don't.
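The behaviour described here can be illustrated with a small sketch. This is not Saxon's implementation; it is a simplified model in Python, written for this thread's example, in which the state is just the index of the next particle in the C_C080 sequence. The names `MODEL`, `permitted_next`, `is_final`, and `validate` are hypothetical, and the message text is abbreviated.

```python
# Content model of C_C080 as (name, required) pairs, in schema order.
# Simplification: each element occurs at most once, so the "state" can be
# the index of the next particle that may still appear.
MODEL = [
    ("D_3036", True),
    ("D_3036_2", False),
    ("D_3036_3", False),
    ("D_3036_4", False),
    ("D_3036_5", True),
    ("D_3045", True),
]
NAMES = [name for name, _ in MODEL]


def permitted_next(state):
    """Elements that may appear next: optional ones up to the first mandatory one."""
    allowed = []
    for name, required in MODEL[state:]:
        allowed.append(name)
        if required:  # a mandatory element cannot be skipped
            break
    return allowed


def is_final(state):
    """A state is an acceptable final state if no mandatory element remains."""
    return not any(required for _, required in MODEL[state:])


def validate(children):
    state = 0
    for child in children:
        allowed = permitted_next(state)
        if child not in allowed:
            return f"Element {child} is not allowed here. Expected one of: {', '.join(allowed)}"
        state = NAMES.index(child) + 1  # transition to the state after this element
    if not is_final(state):
        # Only the elements permitted *next* are reported, not a full completion.
        return ("The content is incomplete. The following elements would be valid here: "
                + ", ".join(permitted_next(state)))
    return "valid"


print(validate(["D_3036", "D_3036_2", "D_3036_3"]))
```

For the instance in the original post, the sketch stops in the state after D_3036_3 and lists D_3036_4 and D_3036_5, the same two elements as the Saxon message (the order may differ). D_3045 is not listed because it cannot appear directly after D_3036_3.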

RE: Question on Validation Message of missing data element - Added by Radar Lei over 5 years ago

Hi Michael,

Got it. We'll consider analyzing the permitted elements from the result ourselves.

RE: Question on Validation Message of missing data element - Added by Michael Kay over 5 years ago

I took a look at implementing this.

In the case where the required sequence is (X,(P|Q|R),Z) and the actual sequence is (X,Z), we explore the FSM and in the case where the sequence would be valid if any one of P, Q, or R appeared between X and Z, then we say so in the error message. If a longer sequence is required between X and Z, we don't report this.

We could do the same thing, I think, where the required sequence is (X,(P|Q|R),$) and the actual sequence is (X,$), where $ represents end of sequence. In the case where adding any one of P, Q, or R at the end would make the sequence valid, we could report this. I have made this change for the next major release.

However, exploring the FSM to discover that a sequence of several omitted elements is needed, as suggested for this example, seems a step too far.
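To make the end-of-sequence case concrete, here is a minimal Python sketch of a one-step lookahead. It is not Saxon's code: the FSM encoding (a dict from state to `{element: next_state}`), the state names, and the message wording are all assumptions made for illustration.

```python
def describe_end_of_input(transitions, final_states, state):
    """Describe the situation when the input ends in `state`."""
    if state in final_states:
        return "valid"
    permitted = sorted(transitions[state])
    # One-step lookahead: which single element leads straight to a final state?
    fixes = [e for e in permitted if transitions[state][e] in final_states]
    if fixes:
        return ("incomplete: adding one of " + ", ".join(fixes)
                + " would make the content valid")
    # No single element suffices, so fall back to listing what may come next.
    return "incomplete: elements valid here are " + ", ".join(permitted)


# Hypothetical FSM for the content model (X, (P|Q|R)).
CHOICE = {
    "start": {"X": "after_X"},
    "after_X": {"P": "done", "Q": "done", "R": "done"},
    "done": {},
}

# Hypothetical FSM for the tail of C_C080, starting after D_3036_3.
C_C080_TAIL = {
    "after_3036_3": {"D_3036_4": "after_3036_4", "D_3036_5": "after_3036_5"},
    "after_3036_4": {"D_3036_5": "after_3036_5"},
    "after_3036_5": {"D_3045": "done"},
    "done": {},
}

print(describe_end_of_input(CHOICE, {"done"}, "after_X"))
print(describe_end_of_input(C_C080_TAIL, {"done"}, "after_3036_3"))
```

In the first case any one of P, Q, or R completes the sequence, so the sketch can say so. In the C_C080 case two elements are still missing, no single addition reaches a final state, and the sketch falls back to the generic list of permitted elements.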

RE: Question on Validation Message of missing data element - Added by Radar Lei over 5 years ago

Hi Michael,

I'm not sure I understand your reply correctly. What does 'FSM' mean?

In your example:

The sequence of elements is (X,(P|Q|R),Z): P|Q|R means exactly one of these elements is mandatory, and Z is a mandatory field but not the last element in the sequence, right? And if only (X,Z) are filled, you'll report an error like "In content of element : The content is incomplete. The following elements would be valid here, all in no namespace: P, Q, R", which is the same as the current report?

The sequence of elements is (X,(P|Q|R),$): P|Q|R means exactly one of these elements is mandatory, and $ is the mandatory field and the last element in the sequence, right?

For a new case: the sequence of elements is (X,(P),(Q),(R),$), where P, Q, R are optional elements and $ is the mandatory field and the last element in the sequence. If only 'X' is filled, what message would you show?

thanks!

RE: Question on Validation Message of missing data element - Added by Michael Kay over 5 years ago

FSM means "Finite State Machine" (sometimes called FSA for Finite State Automaton). The standard way that validation against a grammar is performed is to turn the grammar into a finite state machine (like a railroad diagram showing what sequences of symbols/elements are permitted), and it's useful to understand this mechanism if you want to appreciate what information is available for diagnostics when the input doesn't conform to the grammar - when the train leaves the tracks, so to speak.

The problem when you hit the end of the input and more input is required is that to tell the user how the input could be validly completed, you need to do a search from the current position in the railroad diagram to find a route through to the end; and there can be an infinite number of possible routes. So the search is potentially rather expensive; certainly a lot more expensive and complex than the simple task of checking whether the input conforms to the grammar.
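As an illustration of the search described above, the following Python sketch does a breadth-first search from the current state to the nearest final state. It is only a sketch under assumptions: the FSM encoding and the state names are hypothetical, and nothing here reflects how Saxon stores its automaton. Breadth-first search with a visited set terminates even when the FSM contains loops (which is where the infinite number of routes comes from), but it returns only one shortest route and has to visit states well beyond the current one.

```python
from collections import deque


def shortest_completion(transitions, final_states, start):
    """Return one shortest list of elements leading from `start` to a final state."""
    queue = deque([(start, [])])
    seen = {start}  # visited states, so loops in the FSM cannot trap the search
    while queue:
        state, path = queue.popleft()
        if state in final_states:
            return path
        for element, nxt in transitions[state].items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [element]))
    return None  # no final state is reachable from `start`


# Hypothetical FSM for the tail of C_C080, starting after D_3036_3.
C_C080_TAIL = {
    "after_3036_3": {"D_3036_4": "after_3036_4", "D_3036_5": "after_3036_5"},
    "after_3036_4": {"D_3036_5": "after_3036_5"},
    "after_3036_5": {"D_3045": "done"},
    "done": {},
}

print(shortest_completion(C_C080_TAIL, {"done"}, "after_3036_3"))
```

For the example in this thread the shortest completion is D_3036_5 followed by D_3045, which is the pair of elements the original poster hoped to see. The route through D_3036_4 is also valid but longer, so a message built this way would still be showing only one of several possible completions.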
