Statistics

Note

In this chapter we use the abbreviations ref and sec when referring to the reference input DEM (input_ref) and the secondary input DEM (input_sec) respectively.

Demcompare can compute a wide variety of statistics on either a single input DEM or the difference between two input DEMs. The statistics module can handle a different number of inputs:

"output_dir": "./test_output/",
"input_ref": {
    "path": "./Gironde.tif",
    "nodata": -9999.0
},
"statistics": {
    "remove_outliers": "True"
}

If a single DEM is specified in the configuration, the requested metrics (or the default ones) are computed directly on that DEM.

../_images/stats_input_one_dem.png

Fig. 5 Statistics computation with one input DEM.

By default, the following metrics will be computed: mean, median, max, min, sum, squared_sum, std.
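
As an illustration, the default metrics could be reproduced with NumPy roughly as follows. This is a minimal sketch, not demcompare's implementation: the function name default_metrics is hypothetical, squared_sum is assumed to be the sum of the squared values, and std is assumed to be the population standard deviation.

```python
import numpy as np


def default_metrics(values: np.ndarray) -> dict:
    """Compute the default metrics on an array of valid pixel values.

    Hypothetical helper: assumes invalid pixels were already removed.
    """
    return {
        "mean": float(np.mean(values)),
        "median": float(np.median(values)),
        "max": float(np.max(values)),
        "min": float(np.min(values)),
        "sum": float(np.sum(values)),
        # Assumption: squared_sum is the sum of the squared values.
        "squared_sum": float(np.sum(values ** 2)),
        # Assumption: population standard deviation (ddof=0).
        "std": float(np.std(values)),
    }


stats = default_metrics(np.array([1.0, 2.0, 3.0, 4.0]))
print(stats)
```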

The user may specify the required metrics as follows:

"output_dir": "./test_output/",
"input_ref": {
    "path": "./Gironde.tif",
    "nodata": -9999.0
},
"statistics": {
    "remove_outliers": "True",
    "metrics": ["mean", {"ratio_above_threshold": {"elevation_threshold": [1, 2, 3]}}]
}
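
The metrics list above mixes two kinds of entries: a plain metric name, and a single-key dictionary that maps a metric name to its parameters. The sketch below shows one way such a list could be normalised into (name, parameters) pairs. The helper parse_metrics is hypothetical and only illustrates the structure of the configuration; it is not part of demcompare.

```python
def parse_metrics(metrics):
    """Normalise a metrics list into (name, parameters) pairs.

    Hypothetical helper, for illustration only.
    """
    parsed = []
    for item in metrics:
        if isinstance(item, str):
            # A bare name means "no parameters".
            parsed.append((item, {}))
        elif isinstance(item, dict) and len(item) == 1:
            # A single-key dict maps the metric name to its parameters.
            name, params = next(iter(item.items()))
            parsed.append((name, dict(params)))
        else:
            raise ValueError(f"Unsupported metric entry: {item!r}")
    return parsed


parsed = parse_metrics(
    ["mean", {"ratio_above_threshold": {"elevation_threshold": [1, 2, 3]}}]
)
print(parsed)
```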

With the coregistration step

If both the coregistration and statistics steps are present in the input configuration:

  • In order to evaluate the coregistration effect, the differences between the reprojected DEMs before and after coregistration, named initial_dem_diff and final_dem_diff, will be considered to compute the Probability Density Function and the Cumulative Distribution Function.

  • The difference between the reprojected DEMs after coregistration (the final_dem_diff) will be considered to compute the input or default metrics.

"output_dir": "./test_output/",
"input_ref": {
    "path": "./Gironde.tif",
    "nodata": -9999.0
},
"input_sec": {
    "path": "./FinalWaveBathymetry_T30TXR_20200622T105631_D_MSL_invert.TIF",
    "nodata": -32768
},
"coregistration": {
    "coregistration_method": "nuth_kaab_internal"
},
"statistics": {
    "remove_outliers": "True"
}

../_images/stats_input_after_coreg.png

Fig. 7 Statistics computation after the coregistration step.

The following metrics will be computed:

On initial_dem_diff and on final_dem_diff: cdf, pdf.
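
To make the pdf and cdf metrics more concrete, here is a minimal sketch of an empirical probability density function and cumulative distribution function computed from a difference array. It is an assumption about how such curves can be obtained, not demcompare's own code: the function empirical_pdf_cdf, the number of bins and the synthetic arrays standing in for initial_dem_diff and final_dem_diff are all hypothetical.

```python
import numpy as np


def empirical_pdf_cdf(diff: np.ndarray, bins: int = 100):
    """Return (bin_edges, pdf, cdf) for the finite values of diff."""
    valid = diff[np.isfinite(diff)]
    # density=True normalises the histogram so that it integrates to 1.
    pdf, edges = np.histogram(valid, bins=bins, density=True)
    # Integrate the density bin by bin to obtain the cumulative curve.
    cdf = np.cumsum(pdf * np.diff(edges))
    return edges, pdf, cdf


# Synthetic stand-ins for the two difference images (not real data).
rng = np.random.default_rng(0)
initial_dem_diff = rng.normal(loc=2.0, scale=1.0, size=10_000)
final_dem_diff = rng.normal(loc=0.0, scale=1.0, size=10_000)

curves = {
    "initial_dem_diff": empirical_pdf_cdf(initial_dem_diff, bins=50),
    "final_dem_diff": empirical_pdf_cdf(final_dem_diff, bins=50),
}
```

Comparing the two pairs of curves is what allows the coregistration effect to be assessed: after a successful coregistration the distribution of the differences is expected to be more centred.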

Note

No classification is considered for the metrics used to evaluate the coregistration effect. If classification layers are specified in the input configuration, they will only be considered for the "Other default metrics" computation.

The user may specify the required metrics as follows:

"output_dir": "./test_output/",
"input_ref": {
    "path": "./Gironde.tif",
    "nodata": -9999.0
},
"input_sec": {
    "path": "./FinalWaveBathymetry_T30TXR_20200622T105631_D_MSL_invert.TIF",
    "nodata": -32768
},
"coregistration": {
    "coregistration_method": "nuth_kaab_internal"
},
"statistics": {
    "remove_outliers": "True",
    "metrics": ["mean", {"ratio_above_threshold": {"elevation_threshold": [1, 2, 3]}}]
}

Metrics

The following metrics are currently available in demcompare:

  • mean

  • max

  • min

  • std (Standard Deviation)

  • rmse (Root Mean Squared Error)

  • median

  • nmad (Normalized Median Absolute Deviation) = 1.486 * median(|data - median(data)|)

  • sum

  • squared_sum

  • percentil_90
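
Two of the less common metrics, rmse and nmad, can be sketched with NumPy as follows. This is a hedged illustration rather than demcompare's implementation: the function names are hypothetical, rmse is assumed to be taken with respect to zero (which suits an elevation difference image), and the scale factor defaults to the 1.486 quoted above. The consistency constant usually quoted for the NMAD in the literature is 1.4826, so the factor is left as a parameter.

```python
import numpy as np


def rmse(data: np.ndarray) -> float:
    """Root mean squared error, assuming the reference value is zero."""
    return float(np.sqrt(np.mean(np.square(data))))


def nmad(data: np.ndarray, scale: float = 1.486) -> float:
    """Normalized median absolute deviation: scale * median(|x - median(x)|)."""
    med = np.median(data)
    return float(scale * np.median(np.abs(data - med)))


# A sample with one gross outlier: the NMAD barely reacts to it.
sample = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
print(nmad(sample), rmse(sample))
```

The NMAD is based on medians, so it is far less sensitive to a few extreme values than std or rmse.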

Note

The metrics are always computed on valid pixels. Valid pixels are those whose value is neither NaN nor the nodata value (-32768 by default if not specified in the input configuration or in the input DEM).
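
The valid-pixel rule can be expressed as a boolean mask. The following is a minimal sketch under the definition given in this note; the function name valid_mask is hypothetical.

```python
import numpy as np


def valid_mask(dem: np.ndarray, nodata: float = -32768) -> np.ndarray:
    """True where the pixel is neither NaN nor equal to the nodata value."""
    return ~np.isnan(dem) & (dem != nodata)


dem = np.array([[1.0, np.nan], [-32768.0, 4.0]])
mask = valid_mask(dem)
valid_values = dem[mask]  # 1-D array holding only the valid pixels
```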

Note

In addition to considering only the valid pixels, the user may also specify the remove_outliers option in the input configuration. This option filters out all DEM pixels lying outside the interval between (mu - 3 sigma) and (mu + 3 sigma), where mu is the mean and sigma the standard deviation of all valid pixels in the DEM.
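
A minimal sketch of this 3-sigma filter is given below. It only illustrates the rule stated in this note: the function name is hypothetical, and treating the interval bounds as inclusive is an assumption.

```python
import numpy as np


def remove_outliers(values: np.ndarray) -> np.ndarray:
    """Keep only the values within mu +/- 3 sigma of the valid pixels."""
    mu = np.mean(values)
    sigma = np.std(values)
    # Assumption: the interval bounds are inclusive.
    keep = (values >= mu - 3 * sigma) & (values <= mu + 3 * sigma)
    return values[keep]


# 99 pixels at 0 m and one gross error at 1000 m.
values = np.concatenate([np.zeros(99), [1000.0]])
filtered = remove_outliers(values)
```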

Classification layers

Classification layers are a way to classify the DEM pixels into classes according to different criteria, in order to compute specific statistics for each class.

Four types of classification layers exist: global, segmentation, slope and fusion.

The global classification is the default classification and is always computed. This layer has a single class where all valid pixels are considered. If no classification layers are specified in the input configuration, only the global classification will be considered.
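
The idea of computing statistics per class can be sketched as follows. This is an illustrative assumption, not demcompare's code: the helper mean_by_class, the class map and the class dictionary (class name mapped to a list of label values) are hypothetical.

```python
import numpy as np


def mean_by_class(values, class_map, classes):
    """Mean of the finite values falling in each class.

    classes maps a class name to the list of label values it covers.
    Returns None for a class that contains no valid pixel.
    """
    results = {}
    for name, labels in classes.items():
        in_class = np.isin(class_map, labels) & np.isfinite(values)
        results[name] = float(np.mean(values[in_class])) if in_class.any() else None
    return results


values = np.array([[1.0, 2.0], [3.0, np.nan]])
class_map = np.array([[0, 0], [2, 2]])
classes = {"valid": [0], "Land": [2], "KO": [1]}
per_class = mean_by_class(values, class_map, classes)
```

With a single class covering every valid pixel, the same helper would reproduce the behaviour described for the global classification.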

The modes

As shown in the previous section, demcompare classifies statistics according to classification layers. Each classification layer mask must be superimposable on one DEM, meaning that the classification mask and its support DEM must have the same size and resolution.

Whenever a classification layer is given for both DEMs (say, two DEMs with associated segmentation maps), it is possible to observe the metrics for the pixels whose classification (segmentation, for example) is the same in both DEMs, or for those whose classification differs. These observations are available through what we call modes. Demcompare supports three modes: standard, intersection and exclusion.

In the standard mode, all valid pixels are considered. This means that NaN values, outliers (if remove_outliers was set to "True") and masked pixels are discarded.

Note that NaN values can originate from the altitude differences image and/or from the exogenous classification layers themselves (i.e. if the input segmentation has NaN values, the corresponding pixels will not be considered for the statistics computation of this classification layer).

The following schema shows a scenario where two different segmentation layers and a slope layer are created. Each segmentation layer has a single support, whereas the slope layer has two supports.

  • Segmentation_0 has only ref support, hence the statistics are computed considering the ref segmentation_0_mask.

  • Segmentation_1 has only sec support, hence the statistics are computed considering the sec segmentation_1_mask.

  • Slope_0 has both ref and sec support, hence the statistics are computed considering:

    • the ref slope_0_mask for the standard mode

    • the intersection between the ref slope_0_mask and the sec slope_0_mask for the intersection and exclusion modes.

../_images/stats_support_schema.png

Fig. 9 Statistics schema with intersection and exclusion modes.
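
The three modes can be pictured as boolean masks built from the ref and sec class maps of a layer that has both supports. The sketch below is an interpretation of the description above, not demcompare's implementation: it assumes that the intersection mode keeps the valid pixels on which both class maps agree and that the exclusion mode keeps the valid pixels on which they disagree. The function mode_masks and the sample arrays are hypothetical.

```python
import numpy as np


def mode_masks(ref_classes, sec_classes, valid):
    """Boolean pixel masks for the standard, intersection and exclusion modes."""
    same = ref_classes == sec_classes
    return {
        "standard": valid,              # every valid pixel
        "intersection": valid & same,   # ref and sec agree on the class
        "exclusion": valid & ~same,     # ref and sec disagree
    }


ref_classes = np.array([[1, 1], [2, 2]])
sec_classes = np.array([[1, 2], [2, 1]])
valid = np.array([[True, True], [True, False]])
masks = mode_masks(ref_classes, sec_classes, valid)
```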

Metric selection

The metrics to be computed may be specified at different levels in the statistics configuration:

  • Global level: those metrics will be computed for all classification layers

  • Classification layer level: those metrics will be computed specifically for the given classification layer

For instance, with the following configuration, the mean and ratio_above_threshold metrics would be computed on all layers, whilst the nmad metric would be computed only for the Slope0 layer.

"statistics": {
  "classification_layers": {
      "Status": {
          "type": "segmentation",
          "classes": {
              "valid": [0],
              "KO": [1],
              "Land": [2],
              "NoData": [3],
              "Outside_detector": [4]
          }
      },
      "Slope0": {
          "type": "slope",
          "ranges": [0, 10, 25, 50, 90],
          "metrics": ["nmad"]
      },
      "Fusion0": {
          "type": "fusion",
          "sec": ["Slope0", "Status"]
      }
  },
  "metrics": [
      "mean",
      {"ratio_above_threshold": {"elevation_threshold": [1, 2, 3]}}
  ]
}
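
The two selection levels can be combined as in the sketch below, which lists the metrics that apply to a given layer. It is a hypothetical helper written for illustration: the function name metrics_for_layer and the ordering of the result are assumptions, and only the rule stated above (global metrics apply to every layer, layer metrics only to their own layer) comes from this chapter.

```python
def metrics_for_layer(statistics_cfg: dict, layer_name: str) -> list:
    """Global metrics followed by the metrics specific to one layer."""
    global_metrics = list(statistics_cfg.get("metrics", []))
    layer = statistics_cfg.get("classification_layers", {}).get(layer_name, {})
    # Skip layer metrics that were already requested at the global level.
    extra = [m for m in layer.get("metrics", []) if m not in global_metrics]
    return global_metrics + extra


ratio = {"ratio_above_threshold": {"elevation_threshold": [1, 2, 3]}}
statistics_cfg = {
    "classification_layers": {
        "Status": {"type": "segmentation", "classes": {"valid": [0], "KO": [1]}},
        "Slope0": {"type": "slope", "ranges": [0, 10, 25, 50, 90], "metrics": ["nmad"]},
    },
    "metrics": ["mean", ratio],
}

slope_metrics = metrics_for_layer(statistics_cfg, "Slope0")
status_metrics = metrics_for_layer(statistics_cfg, "Status")
```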

Statistics parameters

Here is the list of the parameters of the input configuration file for the statistics step, with their associated default values when they exist:

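Name             Description                                    Type    Default value            Required
---------------  ---------------------------------------------  ------  -----------------------  --------
remove_outliers  Remove outliers during statistics computation  string  "False"                  No
metrics          Metrics to be computed                         List    List of default metrics  No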

Name             Description                                    Type          Default value                  Required
---------------  ---------------------------------------------  ------------  -----------------------------  --------
type             Classification layer type                      string        None                           Yes
remove_outliers  Remove outliers during statistics computation  string        Value set for the whole stats  No
                 for this particular classification layer
nodata           Classification layer no data value             float or int  -32768                         No
metrics          Classification layer metrics to be computed    List          List of default metrics        No
                 (if metrics have been specified for the whole
                 stats, they will also be computed for this
                 classification)

Name     Description           Type  Default value  Required
-------  --------------------  ----  -------------  --------
classes  Segmentation classes  Dict  None           Yes

Statistics outputs

Output files and their required parameters

The following images and files are saved when the statistics step is activated in the configuration:

Name                                      Description
----------------------------------------  ----------------------------------------------------------------------
dem_for_stats.tif                         DEM on which the statistics have been computed.
ref_rectified_support_map.tif and         Stored in each classification layer folder, the rectified support maps
sec_rectified_support_map.tif             where each pixel has a class value.
stats_results.csv and .json               Stored in each classification layer folder, the CSV and JSON files
                                          storing the computed statistics by class.
stats_results_intersection.csv and .json  Stored in each classification layer folder, the CSV and JSON files
                                          storing the computed statistics by class in intersection mode.
stats_results_exclusion.csv and .json     Stored in each classification layer folder, the CSV and JSON files
                                          storing the computed statistics by class in exclusion mode.

Output directories

When demcompare is run from the command line, the following statistics directories, which may store the respective files, are generated automatically.

.output_dir
+-- stats
    +-- dem_for_stats.tif
    +-- *classification_layer_name*
        +-- stats_results.json/csv
        +-- stats_results_intersection.json/csv
        +-- stats_results_exclusion.json/csv
        +-- ref_rectified_support_map.tif
        +-- sec_rectified_support_map.tif
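
Given this layout, the result files of a run could be enumerated with a short script such as the one below. It is a sketch that relies only on the directory tree shown above; the helper list_stats_results is hypothetical, and the content of the JSON files is deliberately not parsed because their schema is not described here.

```python
from pathlib import Path


def list_stats_results(output_dir) -> dict:
    """Map each classification layer folder to its stats_results*.json files."""
    stats_dir = Path(output_dir) / "stats"
    found = {}
    if not stats_dir.is_dir():
        return found
    for layer_dir in sorted(p for p in stats_dir.iterdir() if p.is_dir()):
        # Matches the standard, intersection and exclusion result files.
        found[layer_dir.name] = sorted(
            f.name for f in layer_dir.glob("stats_results*.json")
        )
    return found


# Returns an empty dict if the run has not produced a stats folder.
results = list_stats_results("./test_output/")
```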

Note

Please note that even if no classification layer has been specified, the results will be stored in a folder called global, because the global classification layer is always computed and considers all valid pixels.

Note

Please note that some data may be missing if it has not been computed for the classification layer (i.e. intersection maps are only computed under certain conditions, see The modes).