Into the palaeoverse
A community-driven R package
The long and the short of it 📏…
Palaeoverse is a project that aims to bring the palaeobiology community together.
palaeoverse provides auxiliary functions to support data preparation and exploration.
Improve code readability, reusability and reproducibility.
A whistle-stop tour of palaeoverse 🚋…
axis_geo, bin_lat, bin_time, data, group_apply, lat_bins, look_up, palaeorotate, phylo_check, tax_check, tax_expand_lat, tax_expand_time, tax_range_space, tax_range_time, tax_unique, time_bins
A lot of data, a lot of sources, and a lot of unique features.
Data structure, not source.
occdf \(\rightarrow\) function(x) \(\rightarrow\) df
Occurrence dataframe*
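A minimal sketch of the "data structure, not source" idea, assuming a toy occurrence data frame. The column names mirror those in the example data (`max_ma`, `min_ma`, `lng`, `lat`), but the values are invented and the helper `add_mid_age()` is hypothetical, not a palaeoverse function. The point is only that a function can rely on the expected columns being present, regardless of which database the rows came from.

```r
# Toy occurrence data frame (values invented for illustration only)
occdf <- data.frame(
  genus  = c("A", "B", "C"),
  max_ma = c(300, 260, 252),
  min_ma = c(290, 254, 247),
  lng    = c(10.5, -75.2, 120.0),
  lat    = c(45.0, -12.3, 30.1)
)

# Hypothetical helper (not part of palaeoverse): takes a data frame,
# checks its structure, and returns a data frame with one added column
add_mid_age <- function(occdf) {
  required <- c("max_ma", "min_ma")
  missing_cols <- setdiff(required, colnames(occdf))
  if (length(missing_cols) > 0) {
    stop("occdf is missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  # Midpoint of each occurrence's age range
  occdf$mid_ma <- (occdf$max_ma + occdf$min_ma) / 2
  occdf
}

# occdf -> function(x) -> df
df <- add_mid_age(occdf)
```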
Let’s dive in 🤿…
The development version can be installed using devtools:
Two example occurrence datasets are available.
Carboniferous–Early Triassic tetrapods (n = 5270, Paleobiology Database).
# Get details on dataset
?tetrapods
# Load dataset
data("tetrapods")
# Available variables
colnames(tetrapods)
## [1] "occurrence_no" "collection_no" "identified_name"
## [4] "identified_rank" "accepted_name" "accepted_rank"
## [7] "early_interval" "late_interval" "max_ma"
## [10] "min_ma" "phylum" "class"
## [13] "order" "family" "genus"
## [16] "abund_value" "abund_unit" "lng"
## [19] "lat" "collection_name" "cc"
## [22] "formation" "stratgroup" "member"
## [25] "zone" "lithology1" "environment"
## [28] "pres_mode" "taxon_environment" "motility"
## [31] "life_habit" "diet"
Phanerozoic reef occurrences (n = 4363, PaleoReefs Database).
Two reference datasets are available.
Geological Time Scale 2012 & 2020 (Gradstein et al. 2012; 2020).
# Get details on dataset
?GTS2012
?GTS2020
# Load dataset
data("GTS2012")
data("GTS2020")
# Increase output width
options(width = 120)
# Print first few rows
head(GTS2012, n = 3)
## interval_number interval_name rank max_ma mid_ma min_ma duration_myr font colour abbr
## 1 1 Holocene stage 0.0117 0.0059 0.0000 0.0117 black #FDEDEC <NA>
## 2 2 Upper Pleistocene stage 0.1260 0.0688 0.0117 0.1143 black #FFF2D3 <NA>
## 3 3 Middle Pleistocene stage 0.7810 0.4535 0.1260 0.6550 black #FFF2C7 <NA>
head(GTS2020, n = 3)
## interval_number interval_name rank max_ma mid_ma min_ma duration_myr font colour abbr
## 1 1 Meghalayan stage 0.0042 0.00210 0.0000 0.0042 black #FDEDEC <NA>
## 2 2 Northgrippian stage 0.0082 0.00620 0.0042 0.0040 black #FDECE4 <NA>
## 3 3 Greenlandian stage 0.0117 0.00995 0.0082 0.0035 black #FEECDB <NA>
# Get first few rows
head(bins, n = 3)
## bin interval_name rank max_ma mid_ma min_ma duration_myr abbr colour font
## 1 1 Puercan North American Land Mammal Ages 66.00 65.375 64.75 1.25 P #FDB469 black
## 2 2 Torrejonian North American Land Mammal Ages 64.75 63.500 62.25 2.50 To #FEBA64 black
## 3 3 Tiffanian North American Land Mammal Ages 62.25 59.875 57.50 4.75 Ti #FEBF6A black
# Get first few rows
head(bins, n = 3)
## bin max_ma mid_ma min_ma duration_myr grouping_rank intervals colour font
## 1 1 541 535.00 529.0 12.0 stage Fortunian #80cdc1 black
## 2 2 529 521.50 514.0 15.0 stage Stage 3, Stage 2 #80cdc1 black
## 3 3 514 507.25 500.5 13.5 stage Drumian, Wuliuan, Stage 4 #80cdc1 black
Five temporal binning methods for age range data:
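For intuition about what these strategies do, here is a minimal base R sketch of two of them: assignment by midpoint age, and assignment to every bin an age range overlaps. It uses invented toy bins and occurrences and is not the package's implementation of `bin_time()`; the boundary handling (a midpoint exactly on a boundary goes to the older bin) is an assumption made for this sketch.

```r
# Toy time bins, oldest first (boundaries in Ma; values for illustration only)
bins <- data.frame(
  bin    = 1:3,
  max_ma = c(300, 280, 260),
  min_ma = c(280, 260, 240)
)

# Toy occurrences with age ranges
occdf <- data.frame(
  id     = c("a", "b", "c"),
  max_ma = c(295, 285, 270),
  min_ma = c(285, 255, 262)
)

# Midpoint-style assignment: exactly one bin per occurrence
mid_ma <- (occdf$max_ma + occdf$min_ma) / 2
occdf$bin_mid <- vapply(mid_ma, function(m) {
  # First bin whose range contains the midpoint (NA if none does)
  bins$bin[m <= bins$max_ma & m > bins$min_ma][1]
}, integer(1))

# "All"-style assignment: every bin that the age range overlaps
overlaps <- lapply(seq_len(nrow(occdf)), function(i) {
  bins$bin[occdf$max_ma[i] > bins$min_ma & occdf$min_ma[i] < bins$max_ma]
})
names(overlaps) <- occdf$id
```

Occurrence `b` spans all three toy bins, so the two strategies disagree: the midpoint places it in a single bin, while the overlap approach returns it once per bin.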
# Use tetrapod example data
occdf <- tetrapods
# Get stage-level time bins
bins <- time_bins(interval = "Phanerozoic", rank = "stage")
# Assign via midpoint age of fossil occurrence data
ex1 <- bin_time(occdf = occdf, bins = bins, method = "mid")
# Assign to all bins that age range covers
ex2 <- bin_time(occdf = occdf, bins = bins, method = "all")
# Assign via majority overlap based on fossil occurrence age range
ex3 <- bin_time(occdf = occdf, bins = bins, method = "majority")
# Randomly assign to overlapping bins based on fossil occurrence age range
ex4 <- bin_time(occdf = occdf, bins = bins, method = "random", reps = 10)
# Randomly assign point estimates (e.g. uniform distribution) based on fossil occurrence age range
ex5 <- bin_time(occdf = occdf, bins = bins, method = "point", reps = 10)
Generate and bin latitudinal data:
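As a rough illustration of latitudinal binning, the sketch below builds equal-width bins in base R and assigns latitudes to them. It does not reproduce the output of the package's `lat_bins()` or `bin_lat()`; the 30-degree bin size, the south-to-north numbering, the object name `lat_bins_toy`, and the latitude values are all assumptions made for this example.

```r
# Equal-width latitudinal bins of 30 degrees (assumed size), south to north
size <- 30
breaks <- seq(from = -90, to = 90, by = size)
lat_bins_toy <- data.frame(
  bin = seq_len(length(breaks) - 1),
  min = head(breaks, -1),
  max = tail(breaks, -1)
)
lat_bins_toy$mid <- (lat_bins_toy$min + lat_bins_toy$max) / 2

# Toy latitudes (invented values)
lat <- c(-75, -10, 0, 44.9, 89)

# Assign each latitude to a bin; a value on a boundary goes to the bin
# above it, and rightmost.closed keeps 90 degrees inside the last bin
bin_id <- findInterval(lat, breaks, rightmost.closed = TRUE)
```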
Generate and bin spatial data:
# Get reef data
occdf <- reefs[1:500, ]
# Bin data using a hexagonal equal-area grid
occdf <- bin_space(occdf = occdf, spacing = 250, return = TRUE)
# Plot world and grid using ggplot2
library(ggplot2)
library(rnaturalearth)
world <- ne_countries(scale = "small", returnclass = "sf")
ggplot() +
  geom_sf(data = world, colour = "black", fill = "lightgrey") +
  geom_sf(data = occdf$grid, fill = "orange", colour = "black") +
  theme_void()