Set restriction process

The profile describes to the generator how to reduce the original set of data to a permitted set of data per field. In most cases the generator starts off using the universal set as the source of data. This represents all values for all types (datetime, string, integer, decimal) without restriction. It also allows for no value to be emitted: the empty set (∅), commonly expressed as null.

The universal set can be visualised as

The generator will only (effectively) use a different original set of data, i.e. not the universal set, if one of the following constraints is used:

  • equalTo - uses the given value and ∅* as the set
  • inSet - uses the given values and ∅* as the entire set
  • null - uses the empty set (∅) as the set of permitted values

* Unless not(field is null) is supplied as a constraint to the field.

Each of the above constraints defines the starting set of permitted values, which any further constraints on the field can then (potentially) filter further.

You can imagine the universal set as divided into a number of quadrants, where each constraint only applies a filter to part of the universal set (or what remains of it). For example:

  • greaterThan will only affect the integer/decimal values in the universal set, other values will remain untouched
  • shorterThan will only affect the string values in the universal set, other values will remain untouched
  • ofType will remove all values from the universal set other than those of the prescribed type (see graphical representation)
  • not null will remove the empty set (∅) from the universal set (see graphical representation)
  • null will remove everything except for the empty set (∅) from the universal set (see graphical representation)
  • inSet removes any value from the universal set that is not in the prescribed set (set intersection) (except ∅ which remains unless not null is used as well) (see graphical representation)
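The type-scoped filtering described above can be sketched in Python. This is only an illustration, not the generator's implementation: the function names (`greater_than`, `shorter_than`, `of_type`, `not_null`, `permitted`) are hypothetical, a small list stands in for the universal set, and `None` stands in for ∅. It also assumes ofType keeps ∅, on the basis that only a not null constraint removes it.

```python
from datetime import datetime
from decimal import Decimal

NULL = None  # stands in for the empty set (∅)


def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float, Decimal)) and not isinstance(value, bool)


def greater_than(limit):
    # Only numeric values are tested; every other value passes untouched.
    return lambda v: not is_number(v) or v > limit


def shorter_than(length):
    # Only strings are tested; every other value passes untouched.
    return lambda v: not isinstance(v, str) or len(v) < length


def of_type(py_type):
    # Removes values of any other type; ∅ is kept (assumption, see lead-in).
    return lambda v: v is NULL or isinstance(v, py_type)


def not_null():
    # Removes only ∅.
    return lambda v: v is not NULL


def permitted(candidates, *constraints):
    # A value survives only if every constraint accepts it.
    return [v for v in candidates if all(c(v) for c in constraints)]


sample = [1, 7, "ab", "abcdef", datetime(2010, 1, 1), NULL]
print(permitted(sample, greater_than(2), shorter_than(3)))
print(permitted(sample, of_type(str)))  # ['ab', 'abcdef', None]
```

Each constraint is a predicate that accepts every value outside its own type, which is what leaves the other "quadrants" of the set untouched.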

Examples:

Numeric restrictions

foo greaterThan 2
and
foo lessThan 5

effectively:

  1. Uses the universal set as the source of all data
  2. Where a value is a number, ensures that it is greater than 2 AND less than 5, i.e. strictly between 2 and 5, from 2.00...1 to 4.99 recurring (yielding the set [{numbers >2..<5}, {all datetimes}, {all strings}, ∅])

This reduced set of values is then the permitted set of values for field foo.

It does not

  • Restrict any other type of value, therefore other types of values are still permitted (string, datetime)
  • Prevent the empty set from being emitted

You might expect the following data to be emitted

foo
3
4
"some string"
2010-01-01T01:01:01.000
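One way to picture how the two numeric constraints combine is to merge them into a single range with exclusive bounds. The sketch below is an assumption about how this could be modelled; `NumericRange`, `merge` and `permits` are hypothetical names and do not come from the generator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NumericRange:
    lower: Optional[float] = None  # exclusive lower bound (greaterThan)
    upper: Optional[float] = None  # exclusive upper bound (lessThan)

    def merge(self, other):
        # Keep the tightest bound from each side.
        lowers = [b for b in (self.lower, other.lower) if b is not None]
        uppers = [b for b in (self.upper, other.upper) if b is not None]
        return NumericRange(max(lowers, default=None), min(uppers, default=None))

    def permits(self, value):
        # Strings, datetimes and ∅ (None) are not numeric, so they always pass.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return True
        if self.lower is not None and value <= self.lower:
            return False
        if self.upper is not None and value >= self.upper:
            return False
        return True


foo = NumericRange(lower=2).merge(NumericRange(upper=5))
candidates = [2, 3, 4, 4.99, 5, "some string", None]
print([v for v in candidates if foo.permits(v)])
# [3, 4, 4.99, 'some string', None]
```

The bounds 2 and 5 themselves are rejected, while the string and the null value are unaffected.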

InSet restrictions

foo inSet [a, b, c]
and
foo inSet [c, d, e]

effectively:

  1. Intersects the set [a, b, c] with the universal set (yielding the set [a, b, c, ∅])
  2. Intersects [a, b, c] with [c, d, e] (yielding the set [c, ∅])

This set of values is then the permitted set of values for field foo.

It does not

  • Prevent the empty set from being emitted

As the original set is a restricted set, only the values provided can be emitted. These values can be of heterogeneous types (e.g. an intermix of datetime, string, decimal and integer types).

You might expect the following data to be emitted

foo
c
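The two intersections can be reproduced with ordinary set operations. This is a minimal sketch that assumes ∅ is modelled as `None` stored inside the set; the helper `in_set` is a hypothetical name.

```python
NULL = None  # stands in for the empty set (∅)


def in_set(values):
    # Permitted values for an inSet constraint: the given values plus ∅.
    return frozenset(values) | {NULL}


first = in_set(["a", "b", "c"])
second = in_set(["c", "d", "e"])

# Applying both constraints to the same field is a plain set intersection.
permitted = first & second
print(permitted == frozenset({"c", NULL}))  # True
```

Because both sides still contain ∅, it survives the intersection alongside the single shared value c.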

Removal of the empty set (null)

foo inSet [a, b, c]
and
foo not(is null)

effectively:

  1. Intersects the set [a, b, c] with the universal set (yielding the set [a, b, c, ∅])
  2. Removes* the empty set (yielding the set [a, b, c])

* In practice the set is not removed; a flag is set to instruct the generator NOT to emit the empty set (null) value

This set of values is then the permitted set of values for field foo.

You might expect the following data to be emitted

foo
a
b
c
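The flag-based approach mentioned in the footnote could look roughly like the following. The class `FieldValues` and its members are hypothetical names chosen for illustration; the generator's actual data structures may differ.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FieldValues:
    values: frozenset
    nullable: bool = True  # ∅ may be emitted until a not-null constraint clears this

    def not_null(self):
        # The value set is left alone; only the flag changes.
        return replace(self, nullable=False)

    def emit(self):
        rows = sorted(self.values)
        return rows + [None] if self.nullable else rows


foo = FieldValues(frozenset({"a", "b", "c"}))
print(foo.emit())             # ['a', 'b', 'c', None]
print(foo.not_null().emit())  # ['a', 'b', 'c']
```

Keeping nullability as a separate flag means the value set never has to contain a special null marker.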

Contradicting sets

foo inSet [a, b, c]
and
foo equalTo [d]

effectively:

  1. Intersects the set [a, b, c] with the universal set (yielding the set [a, b, c, ∅])
  2. Intersects [a, b, c] with [d] (yielding the set [∅])

This set of values is then the permitted set of values for field foo.

You might expect the following data to be emitted

foo

Conditionals (anyOf, allOf, if)

foo inSet [a, b, c, x, y, z]
if foo inSet [a, b, c]
then
bar equalTo [d]
else
bar equalTo [e]

Note that if the else is not supplied, it will be inferred as not(bar equalTo [d]), see the not section lower down.

effectively:

  1. Intersects the set [a, b, c, x, y, z] with the universal set (yielding the set [a, b, c, x, y, z, ∅])
  2. Splits the set into 2 discrete sets ([a, b, c, ∅] and [x, y, z, ∅]) for field foo
  3. For the first set of data for foo ([a, b, c, ∅]) intersect the set [d] with the universal set (yielding the set [d, ∅]) for field bar
  4. Depending on the combination strategy, repeat each item in the set [d, ∅] with each item in the set [a, b, c, ∅] (the cartesian product)
  5. Repeat the process for the second set, where foo has the set [x, y, z, ∅]

This set of values is then the permitted set of values for field foo.

You might expect the following data to be emitted

| foo | bar |
|-----|-----|
|     | e   |
| x   |     |
| y   |     |
| z   |     |
| x   | e   |
| y   | e   |
| z   | e   |
| a   |     |
| b   |     |
| c   |     |
| a   | d   |
| b   | d   |
| c   | d   |
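The split-and-combine steps can be sketched with `itertools.product`. This follows the numbered steps literally and assumes an exhaustive (cartesian product) combination strategy, so it also pairs a null foo with both branches; a real combination strategy may emit fewer rows. All variable names are hypothetical.

```python
from itertools import product

NULL = None  # stands in for the empty set (∅)

foo_values = ["a", "b", "c", "x", "y", "z"]
condition = {"a", "b", "c"}  # if foo inSet [a, b, c]

# Step 2: split foo into the "then" and "else" partitions, each keeping ∅.
then_foo = [v for v in foo_values if v in condition] + [NULL]
else_foo = [v for v in foo_values if v not in condition] + [NULL]

# Step 3: the permitted set for bar in each branch (∅ is still allowed).
then_bar = ["d", NULL]
else_bar = ["e", NULL]

# Steps 4 and 5: cartesian product within each branch.
rows = list(product(then_foo, then_bar)) + list(product(else_foo, else_bar))

print(len(rows))            # 16
print(("a", "d") in rows)   # True
print(("a", "e") in rows)   # False
```

Values from the two branches never mix: a only ever appears with d or null, and x only with e or null.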

Contradicting conditionals (mistakes)

bar not null
bar inSet [x, y, z]
foo inSet [a, b, c]
if foo inSet [a, b, c]
then
bar is null

The else segment has been excluded for clarity, it would be included as not(foo inSet [a, b, c]) and processed as described in the section above.

effectively (in the then branch):

  1. Remove the set [∅] from the universal set and use this as the set for bar going forwards
  2. Intersects the set [a, b, c] with the universal set (yielding the set [a, b, c, ∅])
  3. For the first set of data for foo ([a, b, c, ∅]), intersect the set [∅] with the set of data for bar ([{all datetimes}, {all numeric values}, {all string values}]). This produces an empty set of data (where even the ∅ is not present, i.e. [])

This results in no data being created given the scenario where foo has a value in the set [a, b, c]. The field foo is not restricted from being null therefore it is theoretically permitted for the generator to enter the then when foo is null. This doesn't happen currently as when foo is null it is ambiguous between the then and the else.

You might expect the following data to be emitted (where foo is not in the set [a, b, c])

| foo | bar |
|-----|-----|
|     | x   |
|     | y   |
|     | z   |
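The difference between a set that contains only ∅ and a set that contains nothing at all can be shown with two small intersections. This is a sketch under the assumption that ∅ is modelled as `None` inside a finite set; the variable names are hypothetical.

```python
NULL = None  # stands in for the empty set (∅)

# bar: inSet [x, y, z] combined with not null
bar_base = frozenset({"x", "y", "z", NULL}) - {NULL}

# The then branch adds "bar is null", i.e. an intersection with [∅].
bar_then = bar_base & {NULL}
print(bar_then == frozenset())  # True: nothing can be emitted, not even null

# Compare with contradicting sets where ∅ was never removed:
foo = frozenset({"a", "b", "c", NULL}) & frozenset({"d", NULL})
print(foo == frozenset({NULL}))  # True: null can still be emitted
```

An empty result means the branch produces no rows, whereas a result of [∅] still produces rows with a null value.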

Negation of constraints

The not constraint inverts the operation of the constraint, so the following can be observed:

| constraint | effectively | resulting in |
|------------|-------------|--------------|
| foo is null | intersects the universal set with [∅] | [∅] |
| not(foo is null) | removes [∅] from the universal set | [{all values except: ∅}] |
| foo inSet [a, b, c] | intersects the universal set with [a, b, c]* | [a, b, c, ∅] |
| not(foo inSet [a, b, c]) | removes [a, b, c] from the universal set | [{all values except: a, b or c}] |

* Note, the intersection retains the applicability of the ∅ being emitted. The only way to remove the ∅ from the set of permitted values is to use the not(foo is null) constraint.
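The four rows of the table above can be checked against a small stand-in for the universal set. This is an illustrative sketch only: `UNIVERSE`, `is_null` and `in_set` are hypothetical names, and the real universal set is not a finite collection.

```python
NULL = None  # stands in for the empty set (∅)

# A tiny stand-in for the universal set.
UNIVERSE = frozenset({"a", "b", "c", "d", 1, 2.5, NULL})


def is_null(universe, negated=False):
    # not(is null) removes ∅; is null keeps only ∅.
    return universe - {NULL} if negated else universe & {NULL}


def in_set(universe, values, negated=False):
    values = frozenset(values)
    if negated:
        # ∅ is not one of the listed values, so it survives the removal.
        return universe - values
    # The intersection keeps ∅ as well as the listed values.
    return (universe & values) | (universe & {NULL})


print(is_null(UNIVERSE) == {NULL})                                  # True
print(in_set(UNIVERSE, ["a", "b", "c"]) == {"a", "b", "c", NULL})   # True
print(NULL in in_set(UNIVERSE, ["a", "b", "c"], negated=True))      # True
```

Note that negating inSet is not a plain complement of its permitted set: ∅ remains permitted in both the positive and the negated form.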