Types of spatial data

Notation

(The classification discussed here follows Cressie (1993).)

Consider a spatial process in \(d\) dimensions: \(\{Z(s): s \in D \subset \mathbb{R}^d\}\). \(Z\) is the attribute we observe at spatial location \(s\) (a vector of \(d\) co-ordinates). Most often \(d=2\), with \(s\) given in Cartesian co-ordinates.

1. Geospatial Data

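The attribute is measured only at certain locations, because the domain cannot be sampled exhaustively. The question is whether we can construct a surface of \(Z\) over the entire domain from these measurements. (Cressie (1993) calls this type geostatistical data.)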
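The idea of building a surface from scattered measurements can be sketched with inverse distance weighting (IDW). This is a deliberately simple stand-in interpolator, not the method of these notes and not kriging. The function name `idw_predict`, the simulated locations, and the `power` parameter are all assumptions made for illustration.

```r
# Inverse distance weighting: each prediction is a weighted average of the
# observed values, with weights proportional to 1 / distance^power.
# idw_predict is a hypothetical helper, not part of any package.
idw_predict <- function(obs_xy, obs_z, new_xy, power = 2) {
  apply(new_xy, 1, function(s0) {
    d <- sqrt(rowSums((obs_xy - matrix(s0, nrow(obs_xy), 2, byrow = TRUE))^2))
    if (any(d == 0)) return(obs_z[which(d == 0)[1]])  # exact at observed sites
    w <- 1 / d^power
    sum(w * obs_z) / sum(w)
  })
}

set.seed(42)
# Hypothetical sample locations in D = [0, 1]^2 and a simulated attribute Z
obs_xy <- cbind(x = runif(30), y = runif(30))
obs_z  <- sin(2 * pi * obs_xy[, "x"]) + obs_xy[, "y"] + rnorm(30, sd = 0.1)

# Regular prediction grid covering the whole domain
gx <- seq(0, 1, length.out = 25)
gy <- seq(0, 1, length.out = 25)
grid <- as.matrix(expand.grid(x = gx, y = gy))
z_hat <- idw_predict(obs_xy, obs_z, grid)

# Interpolated surface with the sample locations overlaid
image(gx, gy, matrix(z_hat, 25, 25), xlab = "x", ylab = "y")
points(obs_xy, pch = 19)
```

Because each prediction is a weighted average, the interpolated surface never leaves the range of the observed values. That is one reason to treat IDW only as a first look.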

The R package spatstat has built-in data sets; see https://cran.r-project.org/web/packages/spatstat/vignettes/datasets.pdf

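library(spatstat)              # provides the built-in data sets and as.ppp()
a <- as.ppp(longleaf)          # planar point pattern object
a1 <- as.data.frame(longleaf)  # the same data as a data frame
plot(a)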

aa <- as.ppp(chorley)
plot(aa)

aa$window$bdry
## [[1]]
## [[1]]$x
##   [1] 359.46 359.41 359.56 360.78 361.35 362.31 363.30 363.91 364.71 366.45
##  [11] 365.80 365.61 365.76 364.99 365.40 365.75 365.90 366.04 365.36 365.05
##  [21] 365.20 364.48 364.21 364.47 364.32 363.93 363.97 363.67 363.32 363.28
##  [31] 362.90 362.71 362.93 363.43 363.43 363.69 363.50 363.01 362.74 362.77
##  [41] 363.69 364.22 364.03 363.45 362.54 361.70 361.35 360.86 360.40 359.76
##  [51] 358.38 358.19 358.88 359.00 358.69 358.27 357.55 357.47 357.13 356.56
##  [61] 356.22 355.73 355.23 354.77 354.05 353.13 352.83 352.60 351.99 350.04
##  [71] 347.83 346.16 343.45 344.45 344.30 345.33 345.52 345.83 345.68 345.26
##  [81] 345.23 345.69 345.99 345.65 346.03 346.07 345.66 345.96 345.77 346.27
##  [91] 346.42 346.16 346.46 346.24 346.66 346.54 346.81 346.55 346.86 347.32
## [101] 347.17 347.66 347.93 347.51 348.38 348.35 349.30 349.84 350.94 351.36
## [111] 352.85 353.38 353.72 354.06 354.06 354.25 354.37 353.88 354.03 354.34
## [121] 354.72 355.67 356.13 357.08 358.15 358.42 358.23 358.42 358.61 358.39
## [131] 358.73
## 
## [[1]]$y
##   [1] 410.57 411.15 411.99 412.80 412.57 410.90 411.51 412.51 412.93 414.59
##  [11] 415.25 416.21 416.91 419.08 420.12 420.16 420.50 422.01 422.01 422.29
##  [21] 422.71 423.02 423.99 424.41 425.03 425.31 425.58 425.77 425.77 426.70
##  [31] 426.97 427.48 427.90 428.28 428.63 429.09 429.21 428.98 429.14 429.56
##  [41] 430.22 430.72 431.10 431.14 431.23 431.15 431.74 431.39 431.74 431.36
##  [51] 431.79 431.56 431.01 430.55 430.24 430.01 430.02 429.55 429.63 428.71
##  [61] 428.21 428.35 428.64 428.21 428.57 428.11 428.34 428.96 429.16 429.05
##  [71] 428.48 427.91 427.00 426.92 426.38 425.40 424.90 423.28 423.12 423.12
##  [81] 422.55 421.69 421.57 421.07 420.76 420.03 419.61 419.26 418.99 418.75
##  [91] 418.21 418.02 417.82 417.55 416.86 416.66 416.35 415.04 414.96 414.22
## [101] 413.84 413.45 413.72 414.49 414.49 414.10 413.98 413.52 414.40 414.40
## [111] 413.77 413.77 414.31 414.38 414.69 414.57 413.84 413.42 412.37 412.10
## [121] 412.02 412.17 412.09 412.39 412.54 412.31 412.04 411.81 410.99 410.49
## [131] 410.41
plot(aa$window)

2. Lattice Data (or Regional Data)

Examples are (1) data collected at the ward level and (2) remotely sensed data reported at the pixel level. The data are spatially aggregated, hence the alternative name Regional Data, and are usually observed exhaustively.

The spatial locations of lattice data are called sites rather than points. A site is usually a polygon (for example a ward boundary) with some representative location, such as the centroid.
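As an illustration of a representative location, the area centroid of a simple polygon can be computed from its vertices with the shoelace formula. The helper name `polygon_centroid` and the rectangle used below are assumptions for this sketch. It handles a single ring without holes only.

```r
# Area centroid of a simple (non self-intersecting) polygon given by its
# vertices in order. The ring is closed automatically.
# polygon_centroid is a hypothetical helper, not part of any package.
polygon_centroid <- function(x, y) {
  xn <- c(x[-1], x[1])          # next vertex, wrapping around
  yn <- c(y[-1], y[1])
  cross <- x * yn - xn * y      # shoelace terms
  area <- sum(cross) / 2        # signed area (sign depends on orientation)
  c(x = sum((x + xn) * cross) / (6 * area),
    y = sum((y + yn) * cross) / (6 * area))
}

# A 2 x 1 rectangle with corners (0,0), (2,0), (2,1), (0,1)
ctr <- polygon_centroid(c(0, 2, 2, 0), c(0, 0, 1, 1))
ctr   # centroid at x = 1, y = 0.5
```

For strongly non-convex polygons the centroid can fall outside the polygon, in which case another representative point may be preferable.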

Notation: \(Z(A_i)\)

  1. \(A_i\)’s in close proximity may have similar values (positive spatial autocorrelation)
  2. Identify regional clusters of ‘high’ values
  3. Where are the spatial risks? (Correlate with another covariate.)
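One common way to quantify the first point, whether nearby sites have similar values, is Moran's \(I\). These notes do not name this statistic, so it is offered here only as an illustrative sketch. It assumes a 0/1 weight matrix `W` in which entry \((i, j)\) is 1 when sites \(i\) and \(j\) are neighbours. The function `morans_i` and the five-site example are hypothetical.

```r
# Moran's I: I = (n / sum(W)) * sum_ij W_ij (z_i - zbar)(z_j - zbar) / sum_i (z_i - zbar)^2
# morans_i is a hypothetical helper, not part of any package.
morans_i <- function(z, W) {
  zc <- z - mean(z)
  n <- length(z)
  (n / sum(W)) * sum(W * outer(zc, zc)) / sum(zc^2)
}

# Five sites along a line; each site neighbours the sites next to it
W <- matrix(0, 5, 5)
W[cbind(1:4, 2:5)] <- 1
W <- W + t(W)                      # make the neighbour relation symmetric

z_smooth <- c(1, 2, 3, 4, 5)       # neighbouring values are similar
z_alt    <- c(1, 5, 1, 5, 1)       # neighbouring values alternate

morans_i(z_smooth, W)              # positive: similar values sit together
morans_i(z_alt, W)                 # negative: neighbours are dissimilar
```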

Measures on Lattices

We need a measure of spatial connectivity: how is the distance between ‘representative points’ determined? With each pair of sites \(s_i\) and \(s_j\) associate a weight \(w_{ij}\) that indicates whether the sites are considered spatially connected: \[w_{ij} = \begin{cases} 1 & \textrm{if connected} \\ 0 & \textrm{otherwise.} \end{cases}\]

For pixels: 4-connectivity (neighbours share an edge), diagonal neighbours (neighbours share only a corner), or 8-connectivity (both).
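For a small pixel grid, the 0/1 weights \(w_{ij}\) under 4- and 8-connectivity can be built directly from row and column offsets. This is a minimal sketch in base R. The function name `grid_weights` and its `connectivity` argument are assumptions for illustration, and pixels are indexed column by column.

```r
# Binary connectivity weights for an nr x nc pixel grid.
# connectivity = 4: neighbours share an edge.
# connectivity = 8: neighbours share an edge or a corner.
# grid_weights is a hypothetical helper, not part of any package.
grid_weights <- function(nr, nc, connectivity = 4) {
  stopifnot(connectivity %in% c(4, 8))
  idx <- expand.grid(row = seq_len(nr), col = seq_len(nc))
  dr <- abs(outer(idx$row, idx$row, "-"))   # row offsets between pixel pairs
  dc <- abs(outer(idx$col, idx$col, "-"))   # column offsets between pixel pairs
  W <- if (connectivity == 4) (dr + dc) == 1 else pmax(dr, dc) == 1
  W * 1                                     # logical matrix -> 0/1 matrix
}

W4 <- grid_weights(3, 3, connectivity = 4)
W8 <- grid_weights(3, 3, connectivity = 8)

# Number of neighbours per pixel; pixel 5 is the centre of the 3 x 3 grid
rowSums(W4)
rowSums(W8)
```

The difference `W8 - W4` gives the weights for the diagonal neighbours alone.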

For polygons: two sites are connected if the polygons share a common border, or if their representative points are less than a certain critical distance apart.

knitr::include_graphics('scale1to10.jpg')
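The critical-distance rule for representative points can be sketched with base R's `dist()`. The function name `distance_weights`, the four example points, and the threshold of 1.5 are assumptions for illustration. The shared-border rule needs polygon geometry and is not covered here.

```r
# 0/1 weights: sites i and j are connected when their representative points
# are less than `threshold` apart. A site is not its own neighbour.
# distance_weights is a hypothetical helper, not part of any package.
distance_weights <- function(xy, threshold) {
  D <- as.matrix(dist(xy))     # pairwise Euclidean distances
  W <- (D < threshold) * 1
  diag(W) <- 0
  W
}

# Three nearby representative points and one isolated point
xy <- rbind(c(0, 0), c(1, 0), c(0, 1), c(5, 5))
W <- distance_weights(xy, threshold = 1.5)
W
rowSums(W)   # the isolated site has no neighbours
```

The choice of threshold matters: too small leaves sites without neighbours, too large connects almost everything.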

data(heather)
heather
## List of spatial objects
## 
## fine:
## window: binary image mask
## 1570 x 778 pixel array (ny, nx)
## enclosing rectangle: [0, 9.88] x [0, 19.94] metres
## 
## medium:
## window: binary image mask
## 512 x 256 pixel array (ny, nx)
## enclosing rectangle: [0, 10] x [0, 20] metres
## 
## coarse:
## window: binary image mask
## 200 x 100 pixel array (ny, nx)
## enclosing rectangle: [0, 10] x [0, 20] metres

plot(heather)

plot(heather$fine)

summary(heather$fine)
## binary image mask
## 1570 x 778 pixel array (ny, nx)
## pixel size: 0.0127 by 0.0127 metres
## enclosing rectangle: [0, 9.88] x [0, 19.94] metres
## Window area = 97.0189 square metres
## Unit of length: 1 metre
## Fraction of frame area: 0.492

library(sf)  # provides st_read() for reading the shapefile
Can <- st_read("scottish_lip_cancer.shp")
## Reading layer `scottish_lip_cancer' from data source 
##   `G:\My Drive\My Documents\Work\Conferences and Seminars\Courses\3MC 2024\Inger's Notes\scottish_lip_cancer.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 56 features and 11 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -8.621389 ymin: 54.62722 xmax: -0.7530556 ymax: 60.84444
## Geodetic CRS:  GCS_Assumed_Geographic_1
plot(Can)
## Warning: plotting the first 10 out of 11 attributes; use max.plot = 11 to plot
## all

summary(Can)
##      NAME                 ID            COUNT             SMR       
##  Length:56          Min.   : 1.00   Min.   : 0.000   Min.   :  0.0  
##  Class :character   1st Qu.:14.75   1st Qu.: 4.750   1st Qu.: 49.6  
##  Mode  :character   Median :28.50   Median : 8.000   Median :111.5  
##                     Mean   :28.50   Mean   : 9.571   Mean   :152.6  
##                     3rd Qu.:42.25   3rd Qu.:11.000   3rd Qu.:223.0  
##                     Max.   :56.00   Max.   :39.000   Max.   :652.2  
##       LONG            LAT              PY               EXP_       
##  Min.   :54.94   Min.   :1.430   Min.   :  27075   Min.   : 1.100  
##  1st Qu.:55.78   1st Qu.:3.288   1st Qu.: 100559   1st Qu.: 4.050  
##  Median :56.04   Median :4.090   Median : 182333   Median : 6.300  
##  Mean   :56.40   Mean   :4.012   Mean   : 267498   Mean   : 9.575  
##  3rd Qu.:57.02   3rd Qu.:4.730   3rd Qu.: 313845   3rd Qu.:10.125  
##  Max.   :60.24   Max.   :6.800   Max.   :2316353   Max.   :88.700  
##       AFF             X_COOR           Y_COOR                 geometry 
##  Min.   : 0.000   Min.   :112892   Min.   : 561163   MULTIPOLYGON :56  
##  1st Qu.: 1.000   1st Qu.:256624   1st Qu.: 649520   epsg:NA      : 0  
##  Median : 7.000   Median :287577   Median : 681524   +proj=long...: 0  
##  Mean   : 8.661   Mean   :288524   Mean   : 723127                     
##  3rd Qu.:11.500   3rd Qu.:333949   3rd Qu.: 794380                     
##  Max.   :24.000   Max.   :442245   Max.   :1168904

Can$COUNT
##  [1]  5  3  9 39  2  9 16  6 17 19 15 16  2  6  4  1  8  1  8  1  9  3 10  1 28
## [26]  8  6 11 19  3  2 11 10  9  7  7  5  7  0  7  0  8  3  7 20 31  8  7  6 11
## [51] 15 13  9 26 11 11

Can$SMR
##  [1] 279.3 277.8 162.7 450.3 186.9 197.8 153.0  30.6 216.8 122.8 120.1 111.3
## [13]  46.3  83.3  75.9  27.6  50.7  29.1  93.8  14.2  89.1  36.6  53.3  17.4
## [25]  31.6  85.6  41.0 107.8  37.5  32.1  35.8  89.3 111.6 355.7 157.7  99.6
## [37] 105.3 124.6   0.0 167.5   0.0 241.7 104.2 115.9 301.7 136.7 333.3 304.3
## [49] 303.0 361.8 352.1 295.5 652.2 320.6 125.4  86.8

Can$PY
##  [1]   37521   29374  162867  231337   27075  111665  263205  547016  185472
## [10]  432132  378946  346041  141294  231227  156924  110707  426519  179194
## [19]  233125  246744  296238  238170  617413  146112 2316353  319072  449231
## [28]  382702 1287561  312103  246849  319316  231185   51710  102697  249667
## [37]  139148  163818   38704   94145  103412   86444   65448  163703  165554
## [46]  583327   53199   62603   59183   83190  129271   87815   28324  245513
## [55]  190816  391513

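# Representative points (X_COOR, Y_COOR) together with the case count
count <- data.frame(Can$X_COOR, Can$Y_COOR, Can$COUNT)
# Rectangular window spanning the range of the representative points
w <- owin(xrange = c(min(Can$X_COOR), max(Can$X_COOR)), yrange = c(min(Can$Y_COOR), max(Can$Y_COOR)))
plot(w)

# The first two columns become the co-ordinates, the third becomes the marks
countpp <- as.ppp(count, w)
plot(countpp)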

3. Point Patterns (Unmarked patterns)

A point pattern is a collection of points \(I(s), s \in D^*\). The random domain \(D^*\) consists of the locations in the fixed domain \(D\) for which \(I(s) = 1\) (locations where \(I(s)=0\) are removed from \(D\)). Each realisation of the point process produces a \(D^*\). The indicator function could be something like the following, although the focus is on \(D^*\) rather than on \(I(\cdot)\): \[I(s) = \begin{cases} 1 & \textrm{if } Z(s) \ge c \\ 0 & \textrm{otherwise.} \end{cases}\]
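The construction of \(D^*\) from an indicator can be sketched by thresholding a simulated attribute. Everything here is assumed for illustration: the candidate locations, the standard normal \(Z\), and the threshold \(c = 1\) (named `c_thresh` to avoid masking R's `c()`).

```r
set.seed(7)
n <- 500
# Hypothetical candidate locations in D = [0, 1]^2 with a simulated attribute Z(s)
D <- data.frame(x = runif(n), y = runif(n))
D$z <- rnorm(n)

c_thresh <- 1                          # threshold c in the indicator
D$I <- as.integer(D$z >= c_thresh)     # I(s) = 1 if Z(s) >= c, 0 otherwise

# D* keeps only the locations where I(s) = 1; the attribute is then dropped,
# which is what makes the resulting pattern unmarked
D_star <- D[D$I == 1, c("x", "y")]

plot(D$x, D$y, col = "grey", xlab = "x", ylab = "y")   # all of the candidates
points(D_star$x, D_star$y, pch = 19)                   # the realised D*
```

Re-running with a different seed gives a different \(D^*\), which mirrors the statement that each realisation of the process produces its own random domain.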

Point patterns are effectively unmarked spatial data.

We ask: are the points located at random, or is there a spatial pattern?

data(murchison)
murchison
## List of spatial objects
## 
## gold:
## Planar point pattern: 255 points
## window: rectangle = [352782.9, 682589.6] x [6699742, 7101484] metres
## 
## faults:
## planar line segment pattern: 3252 line segments
## window: rectangle = [352782.9, 682589.6] x [6699742, 7101484] metres
## 
## greenstone:
## window: polygonal boundary
## enclosing rectangle: [352782.9, 681699.6] x [6706467, 7100804] metres
plot(murchison$gold)

plot(murchison$faults)

plot(murchison$greenstone)

mpp <- as.ppp(murchison$gold)
mpp
## Planar point pattern: 255 points
## window: rectangle = [352782.9, 682589.6] x [6699742, 7101484] metres
plot(mpp)
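One simple way to probe whether points are random or patterned is a quadrat count test. Divide the window into equal cells and compare the cell counts with the equal counts expected under complete spatial randomness. The sketch below uses base R and simulated points rather than the gold deposits. The function name `quadrat_chisq` and the 4 x 4 grid are assumptions. spatstat offers its own `quadrat.test()`, which would be the natural choice for a `ppp` object.

```r
# Chi-squared quadrat test on a rectangular window split into k x k cells.
# quadrat_chisq is a hypothetical helper, not part of any package.
quadrat_chisq <- function(x, y, xrange, yrange, k = 4) {
  bx <- seq(xrange[1], xrange[2], length.out = k + 1)
  by <- seq(yrange[1], yrange[2], length.out = k + 1)
  counts <- table(cut(x, bx, include.lowest = TRUE),
                  cut(y, by, include.lowest = TRUE))
  expected <- length(x) / k^2            # equal expected count per cell
  stat <- sum((counts - expected)^2 / expected)
  list(counts = counts, statistic = stat,
       p.value = pchisq(stat, df = k^2 - 1, lower.tail = FALSE))
}

set.seed(3)
# Uniformly scattered points on the unit square
csr <- quadrat_chisq(runif(200), runif(200), c(0, 1), c(0, 1))

# Strongly clustered points around the centre, clipped to the unit square
cx <- pmin(pmax(rnorm(200, mean = 0.5, sd = 0.05), 0), 1)
cy <- pmin(pmax(rnorm(200, mean = 0.5, sd = 0.05), 0), 1)
clu <- quadrat_chisq(cx, cy, c(0, 1), c(0, 1))

csr$p.value   # p-value for the uniform pattern
clu$p.value   # very small: counts are far from equal across cells
```

The result depends on the number of cells, so the test is best read as a first screen rather than a full description of the pattern.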

lpp <- as.psp(murchison$faults)
lpp
## planar line segment pattern: 3252 line segments
## window: rectangle = [352782.9, 682589.6] x [6699742, 7101484] metres
plot(lpp)

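# Named greenstone_win rather than ppp, which is also the name of
# spatstat's point pattern constructor
greenstone_win <- as.polygonal(murchison$greenstone)
greenstone_win
## window: polygonal boundary
## enclosing rectangle: [352782.9, 681699.6] x [6706467, 7100804] metres
plot(greenstone_win)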