Class DynamicValidator

java.lang.Object
com.illumon.iris.validation.DataQualityTestCase
com.illumon.iris.validation.dynamic.DynamicValidator
All Implemented Interfaces:
DataQualityTestCaseInterface, DynamicValidatorInterface

public class DynamicValidator
extends DataQualityTestCase
implements DynamicValidatorInterface
A user-interface-driven data validator.
  • Constructor Details

    • DynamicValidator

      public DynamicValidator​(ValidationTableDescription validationTableDescription)
      Create a test case for use in validation.
      Parameters:
      validationTableDescription - description of the table to validate.
  • Method Details

    • assertSize

      public void assertSize​(long min, long max)
      Description copied from interface: DynamicValidatorInterface
      Asserts the number of rows in the table is in the inclusive range [min,max].
      Specified by:
      assertSize in interface DynamicValidatorInterface
      Parameters:
      min - minimum number of table rows
      max - maximum number of table rows
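      Example: the inclusive range means that a table with exactly min or exactly max rows passes. The sketch below is a plain-Java stand-in for that rule, not the library implementation; the class and method names (SizeRangeSketch, sizeInRange) are hypothetical.

```java
// Hypothetical stand-in for the inclusive row-count rule; not the library code.
final class SizeRangeSketch {
    /** Returns true when rowCount lies in the inclusive range [min, max]. */
    static boolean sizeInRange(long rowCount, long min, long max) {
        return rowCount >= min && rowCount <= max;
    }

    public static void main(String[] args) {
        // Both endpoints are accepted because the range is inclusive.
        System.out.println(sizeInRange(100, 100, 200));
        System.out.println(sizeInRange(201, 100, 200));
    }
}
```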
    • assertColumnType

      public void assertColumnType​(String column, Class type)
      Description copied from interface: DynamicValidatorInterface
      Asserts that a column is of the expected type.
      Specified by:
      assertColumnType in interface DynamicValidatorInterface
      Parameters:
      column - column to validate
      type - expected type
    • assertColumnGrouped

      public void assertColumnGrouped​(String column)
      Description copied from interface: DynamicValidatorInterface
      Asserts that a column is grouped.
      Specified by:
      assertColumnGrouped in interface DynamicValidatorInterface
      Parameters:
      column - column to validate
    • assertAllValuesEqual

      public void assertAllValuesEqual​(boolean removeNull, boolean removeNaN, boolean removeInf, String column)
      Description copied from interface: DynamicValidatorInterface
      Asserts that a column only contains a single value.
      Specified by:
      assertAllValuesEqual in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to validate
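      Example: the three remove flags filter rows before the assertion is evaluated. The sketch below illustrates that order of operations on an in-memory list of doubles. It is an assumption-laden illustration rather than the library code: Java null stands in for a NULL cell, and the names AllValuesEqualSketch, applyRemoveFlags and allValuesEqual are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical illustration of "filter first, then assert"; not the library code.
final class AllValuesEqualSketch {
    /** Drops the values that the remove flags ask to ignore (null models NULL here). */
    static List<Double> applyRemoveFlags(List<Double> values, boolean removeNull,
                                         boolean removeNaN, boolean removeInf) {
        return values.stream()
                .filter(v -> !(removeNull && v == null))
                .filter(v -> !(removeNaN && v != null && v.isNaN()))
                .filter(v -> !(removeInf && v != null && v.isInfinite()))
                .collect(Collectors.toList());
    }

    /** True when every value equals the first one; an empty list passes vacuously. */
    static boolean allValuesEqual(List<Double> values) {
        return values.stream().allMatch(v -> Objects.equals(v, values.get(0)));
    }

    public static void main(String[] args) {
        List<Double> raw = Arrays.asList(1.5, null, Double.NaN, 1.5, Double.POSITIVE_INFINITY);
        System.out.println(allValuesEqual(applyRemoveFlags(raw, true, true, true)));
        System.out.println(allValuesEqual(applyRemoveFlags(raw, true, false, false)));
    }
}
```

      With all three flags set, only the two 1.5 entries remain, so the check passes; leaving NaN and Inf in place makes it fail.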
    • assertAllValuesNotEqual

      public void assertAllValuesNotEqual​(boolean removeNull, boolean removeNaN, boolean removeInf, String column)
      Description copied from interface: DynamicValidatorInterface
      Asserts that a column does not contain repeated values.
      Specified by:
      assertAllValuesNotEqual in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to validate
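      Example: "does not contain repeated values" can be read as "every value is unique". A minimal plain-Java sketch of that reading follows; the names NoRepeatsSketch and hasNoRepeats are hypothetical, and this is not the library implementation.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical uniqueness check; not the library code.
final class NoRepeatsSketch {
    /** Returns true when no value occurs more than once. */
    static <T> boolean hasNoRepeats(List<T> values) {
        Set<T> seen = new HashSet<>();
        for (T v : values) {
            // Set.add returns false when the value was already present.
            if (!seen.add(v)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasNoRepeats(List.of("a", "b", "c")));
        System.out.println(hasNoRepeats(List.of("a", "b", "a")));
    }
}
```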
    • assertAllValuesEqualInputValue

      public void assertAllValuesEqualInputValue​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, Object value)
      Description copied from interface: DynamicValidatorInterface
      Asserts that a column only contains a single input value.
      Specified by:
      assertAllValuesEqualInputValue in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to validate
      value - make sure the column only contains this value
    • assertAllValuesNotEqualInputValue

      public void assertAllValuesNotEqualInputValue​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, Object value)
      Description copied from interface: DynamicValidatorInterface
      Asserts that a column does not contain the specified input value.
      Specified by:
      assertAllValuesNotEqualInputValue in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to validate
      value - make sure the column does not contain this value
    • assertEqual

      public void assertEqual​(boolean removeNull, boolean removeNaN, boolean removeInf, String column1, String column2)
      Description copied from interface: DynamicValidatorInterface
      Asserts that all values in column1 are equal to all values in column2.
      Specified by:
      assertEqual in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column1 - column to test
      column2 - column to test
    • assertNotEqual

      public void assertNotEqual​(boolean removeNull, boolean removeNaN, boolean removeInf, String column1, String column2)
      Description copied from interface: DynamicValidatorInterface
      Asserts that all values in column1 are not equal to all values in column2.
      Specified by:
      assertNotEqual in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column1 - column to test
      column2 - column to test
    • assertLess

      public void assertLess​(boolean removeNull, boolean removeNaN, boolean removeInf, String column1, String column2)
      Description copied from interface: DynamicValidatorInterface
      Asserts that all values in column1 are less than all values in column2.
      Specified by:
      assertLess in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column1 - column to test
      column2 - column to test
    • assertLessEqual

      public void assertLessEqual​(boolean removeNull, boolean removeNaN, boolean removeInf, String column1, String column2)
      Description copied from interface: DynamicValidatorInterface
      Asserts that all values in column1 are less than or equal to all values in column2.
      Specified by:
      assertLessEqual in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column1 - column to test
      column2 - column to test
    • assertGreater

      public void assertGreater​(boolean removeNull, boolean removeNaN, boolean removeInf, String column1, String column2)
      Description copied from interface: DynamicValidatorInterface
      Asserts that all values in column1 are greater than all values in column2.
      Specified by:
      assertGreater in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column1 - column to test
      column2 - column to test
    • assertGreaterEqual

      public void assertGreaterEqual​(boolean removeNull, boolean removeNaN, boolean removeInf, String column1, String column2)
      Description copied from interface: DynamicValidatorInterface
      Asserts that all values in column1 are greater than or equal to all values in column2.
      Specified by:
      assertGreaterEqual in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column1 - column to test
      column2 - column to test
    • assertNumberDistinctValues

      public void assertNumberDistinctValues​(boolean removeNull, boolean removeNaN, boolean removeInf, String columns, long min, long max)
      Description copied from interface: DynamicValidatorInterface
      Asserts the number of distinct values is in the inclusive range [min,max].
      Specified by:
      assertNumberDistinctValues in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      columns - comma separated list of columns to test
      min - minimum number of distinct values
      max - maximum number of distinct values
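      Example: when several columns are listed, one plausible reading is that distinct combinations of those columns are counted. The sketch below assumes that reading and models each row as a list of cell values; DistinctCountSketch and its methods are hypothetical names, not part of the library.

```java
import java.util.HashSet;
import java.util.List;

// Hypothetical distinct-combination count; assumes rows are compared as whole tuples.
final class DistinctCountSketch {
    /** Counts distinct rows, where each row holds the values of the tested columns. */
    static <T> long distinctCount(List<T> rows) {
        return new HashSet<>(rows).size();
    }

    /** True when the distinct count lies in the inclusive range [min, max]. */
    static <T> boolean distinctCountInRange(List<T> rows, long min, long max) {
        long n = distinctCount(rows);
        return n >= min && n <= max;
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
                List.of("AAPL", "NYSE"),
                List.of("AAPL", "NYSE"),
                List.of("MSFT", "NASDAQ"));
        System.out.println(distinctCount(rows));
        System.out.println(distinctCountInRange(rows, 1, 2));
    }
}
```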
    • assertAllValuesInDistinctSet

      public void assertAllValuesInDistinctSet​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, Object... expectedValues)
      Description copied from interface: DynamicValidatorInterface
      Asserts that all values in a column are present in a set of expected values.
      Specified by:
      assertAllValuesInDistinctSet in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      expectedValues - set of expected values
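      Example: the assertion amounts to a subset test of the column's values against expectedValues. A minimal plain-Java sketch, using a varargs parameter like the signature above; InDistinctSetSketch and allValuesInSet are hypothetical names and this is not the library code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical subset test; not the library code.
final class InDistinctSetSketch {
    /** Returns true when every column value appears among expectedValues. */
    static boolean allValuesInSet(List<?> values, Object... expectedValues) {
        Set<Object> expected = new HashSet<>(Arrays.asList(expectedValues));
        return expected.containsAll(values);
    }

    public static void main(String[] args) {
        System.out.println(allValuesInSet(List.of("BUY", "SELL", "BUY"), "BUY", "SELL"));
        System.out.println(allValuesInSet(List.of("BUY", "HOLD"), "BUY", "SELL"));
    }
}
```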
    • assertAllValuesInArrayInDistinctSet

      public void assertAllValuesInArrayInDistinctSet​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, Object... expectedValues)
      Description copied from interface: DynamicValidatorInterface
      Asserts that all values contained in arrays in a column are present in a set of expected values.
      Specified by:
      assertAllValuesInArrayInDistinctSet in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      expectedValues - set of expected values
    • assertAllValuesInStringSetInDistinctSet

      public void assertAllValuesInStringSetInDistinctSet​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, Object... expectedValues)
      Description copied from interface: DynamicValidatorInterface
      Asserts that all values contained in string sets in a column are present in a set of expected values.
      Specified by:
      assertAllValuesInStringSetInDistinctSet in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      expectedValues - set of expected values
    • assertAllValuesNotInDistinctSet

      public void assertAllValuesNotInDistinctSet​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, Object... values)
      Description copied from interface: DynamicValidatorInterface
      Asserts that all values in a column are not present in a set of values.
      Specified by:
      assertAllValuesNotInDistinctSet in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      values - set of values
    • assertAllValuesInArrayNotInDistinctSet

      public void assertAllValuesInArrayNotInDistinctSet​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, Object... expectedValues)
      Description copied from interface: DynamicValidatorInterface
      Asserts that all values contained in arrays in a column are not present in a set of values.
      Specified by:
      assertAllValuesInArrayNotInDistinctSet in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      expectedValues - set of values
    • assertAllValuesInStringSetNotInDistinctSet

      public void assertAllValuesInStringSetNotInDistinctSet​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, Object... values)
      Description copied from interface: DynamicValidatorInterface
      Asserts that all values contained in string sets in a column are not present in a set of values.
      Specified by:
      assertAllValuesInStringSetNotInDistinctSet in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      values - set of values
    • assertFracWhere

      public void assertFracWhere​(String filter, double min, double max)
      Description copied from interface: DynamicValidatorInterface
      Asserts the fraction of a table's rows matching the provided filter falls within a defined range.
      Specified by:
      assertFracWhere in interface DynamicValidatorInterface
      Parameters:
      filter - filter
      min - minimum fraction of values remaining after the filter. Between 0 and 1.
      max - maximum fraction of values remaining after the filter. Between 0 and 1.
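      Example: the quantity tested is (rows matching the filter) / (total rows). The sketch below computes it for an in-memory list, with a java.util.function.Predicate standing in for the filter string. That substitution, the NaN result for an empty input, and the names FracWhereSketch, fractionMatching and fracWhereInRange are all assumptions made for illustration.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical fraction-of-rows check; a Predicate replaces the filter string.
final class FracWhereSketch {
    /** Fraction of rows accepted by the filter; NaN for an empty input (an assumption). */
    static <T> double fractionMatching(List<T> rows, Predicate<T> filter) {
        if (rows.isEmpty()) {
            return Double.NaN;
        }
        long matches = rows.stream().filter(filter).count();
        return (double) matches / rows.size();
    }

    /** True when the matching fraction lies in the inclusive range [min, max]. */
    static <T> boolean fracWhereInRange(List<T> rows, Predicate<T> filter,
                                        double min, double max) {
        double frac = fractionMatching(rows, filter);
        return frac >= min && frac <= max;
    }

    public static void main(String[] args) {
        List<Integer> rows = List.of(1, 2, 3, 4);
        System.out.println(fractionMatching(rows, x -> x > 2));
        System.out.println(fracWhereInRange(rows, x -> x > 2, 0.4, 0.6));
    }
}
```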
    • assertFracNull

      public void assertFracNull​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, double min, double max)
      Description copied from interface: DynamicValidatorInterface
      Asserts that the fraction of NULL values is in the inclusive range [min,max].
      Specified by:
      assertFracNull in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      min - minimum fraction of NULL values. Between 0 and 1.
      max - maximum fraction of NULL values. Between 0 and 1.
    • assertNotNull

      public void assertNotNull​(String... columns)
      Description copied from interface: DynamicValidatorInterface
      Asserts that no null values exist in the specified columns.
      Specified by:
      assertNotNull in interface DynamicValidatorInterface
      Parameters:
      columns - columns to test
    • assertFracNan

      public void assertFracNan​(boolean removeNull, boolean removeInf, String column, double min, double max)
      Description copied from interface: DynamicValidatorInterface
      Asserts that the fraction of NaN values is in the inclusive range [min,max].
      Specified by:
      assertFracNan in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeInf - true to remove rows where column is Inf
      column - column to test
      min - minimum fraction of NaN values. Between 0 and 1.
      max - maximum fraction of NaN values. Between 0 and 1.
    • assertFracInf

      public void assertFracInf​(boolean removeNull, boolean removeNaN, String column, double min, double max)
      Description copied from interface: DynamicValidatorInterface
      Asserts that the fraction of infinite values is in the inclusive range [min,max].
      Specified by:
      assertFracInf in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      column - column to test
      min - minimum fraction of infinite values. Between 0 and 1.
      max - maximum fraction of infinite values. Between 0 and 1.
    • assertFracZero

      public void assertFracZero​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, double min, double max)
      Description copied from interface: DynamicValidatorInterface
      Asserts that the fraction of zero values is in the inclusive range [min,max].
      Specified by:
      assertFracZero in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      min - minimum fraction of zero values. Between 0 and 1.
      max - maximum fraction of zero values. Between 0 and 1.
    • assertFracValuesBetween

      public void assertFracValuesBetween​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, Comparable minValue, Comparable maxValue, double min, double max)
      Description copied from interface: DynamicValidatorInterface
      Asserts that the fraction of values between [minValue,maxValue] is in the inclusive range [min,max].
      Specified by:
      assertFracValuesBetween in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      minValue - minimum value for the value range
      maxValue - maximum value for the value range
      min - minimum fraction of values within [minValue,maxValue]. Between 0 and 1.
      max - maximum fraction of values within [minValue,maxValue]. Between 0 and 1.
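      Example: two ranges are involved here, a value range [minValue,maxValue] and a fraction range [min,max]. The sketch below computes the fraction that is then compared against [min,max]. It assumes both ends of the value range are inclusive; FracBetweenSketch and fractionBetween are hypothetical names, not library code.

```java
import java.util.List;

// Hypothetical fraction-in-value-range computation; inclusive bounds are an assumption.
final class FracBetweenSketch {
    /** Fraction of values v with minValue <= v <= maxValue. */
    static <T extends Comparable<T>> double fractionBetween(List<T> values,
                                                            T minValue, T maxValue) {
        long inRange = values.stream()
                .filter(v -> v.compareTo(minValue) >= 0 && v.compareTo(maxValue) <= 0)
                .count();
        return (double) inRange / values.size();
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(1, 5, 10, 20);
        // The resulting fraction would then be tested against [min, max].
        System.out.println(fractionBetween(values, 5, 10));
    }
}
```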
    • assertAllValuesBetween

      public void assertAllValuesBetween​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, Comparable minValue, Comparable maxValue)
      Description copied from interface: DynamicValidatorInterface
      Asserts that all values in the column are between [minValue,maxValue].
      Specified by:
      assertAllValuesBetween in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      minValue - minimum value for the value range
      maxValue - maximum value for the value range
    • assertMin

      public void assertMin​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, Comparable min, Comparable max, String... groupByColumns)
      Description copied from interface: DynamicValidatorInterface
      Asserts that the minimum value of the column is in the inclusive range [min,max].
      Specified by:
      assertMin in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      min - minimum value for the value range
      max - maximum value for the value range
      groupByColumns - columns delineating the groups to test
    • assertMax

      public void assertMax​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, Comparable min, Comparable max, String... groupByColumns)
      Description copied from interface: DynamicValidatorInterface
      Asserts that the maximum value of the column is in the inclusive range [min,max].
      Specified by:
      assertMax in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      min - minimum value for the value range
      max - maximum value for the value range
      groupByColumns - columns delineating the groups to test
    • assertAvg

      public void assertAvg​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, double min, double max, String... groupByColumns)
      Description copied from interface: DynamicValidatorInterface
      Asserts that the average of the column is in the inclusive range [min,max].
      Specified by:
      assertAvg in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      min - minimum value for the value range
      max - maximum value for the value range
      groupByColumns - columns delineating the groups to test
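      Example: one way to read groupByColumns is that the statistic is computed per group and each group's result must fall in [min,max]. The sketch below assumes that reading, which this page does not confirm, and uses a single string key as the group. GroupedAvgSketch, Row, averageByGroup and allGroupAveragesInRange are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical per-group average check; the per-group semantics are an assumption.
final class GroupedAvgSketch {
    /** One table row reduced to a group key and the tested value. */
    static final class Row {
        final String group;
        final double value;

        Row(String group, double value) {
            this.group = group;
            this.value = value;
        }
    }

    /** Average of the value column for each group key. */
    static Map<String, Double> averageByGroup(List<Row> rows) {
        return rows.stream().collect(
                Collectors.groupingBy(r -> r.group, Collectors.averagingDouble(r -> r.value)));
    }

    /** True when every group's average lies in the inclusive range [min, max]. */
    static boolean allGroupAveragesInRange(List<Row> rows, double min, double max) {
        return averageByGroup(rows).values().stream()
                .allMatch(avg -> avg >= min && avg <= max);
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(new Row("A", 1.0), new Row("A", 3.0), new Row("B", 10.0));
        System.out.println(averageByGroup(rows));
        System.out.println(allGroupAveragesInRange(rows, 0.0, 5.0));
    }
}
```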
    • assertStd

      public void assertStd​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, double min, double max, String... groupByColumns)
      Description copied from interface: DynamicValidatorInterface
      Asserts that the standard deviation of the column is in the inclusive range [min,max].
      Specified by:
      assertStd in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      min - minimum value for the value range
      max - maximum value for the value range
      groupByColumns - columns delineating the groups to test
    • assertPercentile

      public void assertPercentile​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, double percentile, double min, double max, String... groupByColumns)
      Description copied from interface: DynamicValidatorInterface
      Asserts that the defined percentile of the column is in the inclusive range [min,max].
      Specified by:
      assertPercentile in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      percentile - percentile of the column to test. Between 0 and 1.
      min - minimum value for the value range
      max - maximum value for the value range
      groupByColumns - columns delineating the groups to test
    • assertAscending

      public void assertAscending​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, String... groupByColumns)
      Description copied from interface: DynamicValidatorInterface
      Asserts that sub-groups of a column have monotonically increasing values. Consecutive values within a group must be equal or increasing.
      Specified by:
      assertAscending in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      groupByColumns - columns delineating groups for testing monotonicity
    • assertStrictlyAscending

      public void assertStrictlyAscending​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, String... groupByColumns)
      Description copied from interface: DynamicValidatorInterface
      Asserts that sub-groups of a column have monotonically strictly increasing values. Consecutive values within a group must be increasing.
      Specified by:
      assertStrictlyAscending in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      groupByColumns - columns delineating groups for testing monotonicity
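      Example: the difference between assertAscending and assertStrictlyAscending is whether equal consecutive values are allowed. The sketch below shows the comparison for the values of a single group; AscendingSketch and isAscending are hypothetical names, and the strict flag is an illustration device rather than a library parameter.

```java
import java.util.List;

// Hypothetical monotonicity check for one group's values; not the library code.
final class AscendingSketch {
    /**
     * Returns true when consecutive values never decrease.
     * With strict set, equal consecutive values are rejected as well.
     */
    static <T extends Comparable<T>> boolean isAscending(List<T> values, boolean strict) {
        for (int i = 1; i < values.size(); i++) {
            int cmp = values.get(i - 1).compareTo(values.get(i));
            if (cmp > 0 || (strict && cmp == 0)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(1, 2, 2, 3);
        System.out.println(isAscending(values, false));
        System.out.println(isAscending(values, true));
    }
}
```

      The descending variants follow the same pattern with the comparison reversed.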
    • assertDescending

      public void assertDescending​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, String... groupByColumns)
      Description copied from interface: DynamicValidatorInterface
      Asserts that sub-groups of a column have monotonically decreasing values. Consecutive values within a group must be equal or decreasing.
      Specified by:
      assertDescending in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      groupByColumns - columns delineating groups for testing monotonicity
    • assertStrictlyDescending

      public void assertStrictlyDescending​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, String... groupByColumns)
      Description copied from interface: DynamicValidatorInterface
      Asserts that sub-groups of a column have monotonically strictly decreasing values. Consecutive values within a group must be decreasing.
      Specified by:
      assertStrictlyDescending in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to test
      groupByColumns - columns delineating groups for testing monotonicity
    • assertCountEqual

      public void assertCountEqual​(boolean removeNull, boolean removeNaN, boolean removeInf, String column, Object value1, Object value2)
      Description copied from interface: DynamicValidatorInterface
      Asserts that a column contains the same number of rows for two given values.
      Specified by:
      assertCountEqual in interface DynamicValidatorInterface
      Parameters:
      removeNull - true to remove rows where column is NULL
      removeNaN - true to remove rows where column is NaN
      removeInf - true to remove rows where column is Inf
      column - column to validate
      value1 - make sure the column has the same number of value1 and value2 entries
      value2 - make sure the column has the same number of value1 and value2 entries
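      Example: the check compares how many rows hold value1 with how many hold value2. A minimal plain-Java sketch using Collections.frequency; CountEqualSketch and countsEqual are hypothetical names and this is not the library implementation.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical occurrence-count comparison; not the library code.
final class CountEqualSketch {
    /** Returns true when value1 and value2 occur the same number of times. */
    static boolean countsEqual(List<?> values, Object value1, Object value2) {
        return Collections.frequency(values, value1) == Collections.frequency(values, value2);
    }

    public static void main(String[] args) {
        System.out.println(countsEqual(List.of("BUY", "SELL", "BUY", "SELL"), "BUY", "SELL"));
        System.out.println(countsEqual(List.of("BUY", "BUY", "SELL"), "BUY", "SELL"));
    }
}
```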
    • assertColumnTypes

      public void assertColumnTypes()
      Description copied from interface: DynamicValidatorInterface
      Asserts that the column types in the table match the column types in the schema.
      Specified by:
      assertColumnTypes in interface DynamicValidatorInterface