Accessing OTN Data via OBIS
Overview
Teaching: 15 min
Exercises: 15 minQuestions
What does OBIS hold for the Ocean Tracking Network?
How do I query OTN records by node, dataset, or species?
How can I bring OTN occurrence data into R or Python for mapping?
Objectives
Recognize what OTN contributes to OBIS (occurrence-focused).
Filter OBIS by OTN node, dataset, taxon, region, and time.
Load OTN records into R or Python and make a quick map.
What is OBIS and how does OTN fit in?
- OBIS (Ocean Biodiversity Information System) is the global hub for marine species occurrence data.
- The Ocean Tracking Network (OTN) is one of OBIS’s regional nodes.
OTN contributes tagging and tracking metadata summarized as occurrence records (e.g., tag releases, detection events). - To focus only on OTN-contributed data, use the OTN node UUID:
68f83ea7-69a7-44fd-be77-3c3afd6f3cf8
Anatomy of an OBIS query (OTN-focused)
An OBIS request is just a URL with filters.
Here’s a simple example, limited to blue sharks contributed by OTN:
https://api.obis.org/v3/occurrence?
nodeid=68f83ea7-69a7-44fd-be77-3c3afd6f3cf8&
scientificname=Prionace%20glauca&
startdate=2000-01-01&
size=100
What each part means
nodeid — restrict to the OTN node
scientificname — filter by species (e.g., Prionace glauca)
startdate / enddate — filter by time window
geometry — filter by region (polygon or bounding box in WKT)
datasetid — limit to one OTN dataset
size / from — control paging through large results
fields — choose only the columns you need (faster, smaller)
Minimal recipes: your first OBIS query
The smallest working recipe is just: species + OTN node. Here’s an example pulling blue shark (Prionace glauca) records contributed by OTN.
R (robis)
# install.packages("robis") # once
library(robis)
OTN <- "68f83ea7-69a7-44fd-be77-3c3afd6f3cf8"
blue <- occurrence("Prionace glauca", nodeid = OTN, size = 500)
head(blue)
Python (pyobis)
# pip install pyobis pandas # once
from pyobis import occurrences
OTN = "68f83ea7-69a7-44fd-be77-3c3afd6f3cf8"
blue = occurrences.search(
scientificname="Prionace glauca",
nodeid=OTN,
size=500
).execute()
print(blue.head())
Adding filters
Just like with WFS, you can add filters to make queries more precise. The most useful are:
- Time window (
startdate
,enddate
) - Place (
geometry=
WKT polygon or bbox in lon/lat) - Dataset (
datasetid=
) - Fields (
fields=
→ return only the columns you need)
R
# Time filter (since 2000)
blue_time <- occurrence("Prionace glauca", nodeid = OTN, startdate = "2000-01-01")
# Place filter (box in Gulf of St. Lawrence)
wkt <- "POLYGON((-70 40, -70 50, -55 50, -55 40, -70 40))"
blue_space <- occurrence("Prionace glauca", nodeid = OTN, geometry = wkt)
# Dataset filter (replace with an actual dataset UUID)
# blue_ds <- occurrence("Prionace glauca", nodeid = OTN, datasetid = "DATASET-UUID")
# Slim down fields
blue_lean <- occurrence("Prionace glauca", nodeid = OTN,
fields = c("scientificName","eventDate","decimalLatitude","decimalLongitude","datasetName"))
Python
# Time filter (since 2000)
blue_time = occurrences.search(
scientificname="Prionace glauca", nodeid=OTN, startdate="2000-01-01"
).execute()
# Place filter (box in Gulf of St. Lawrence)
wkt = "POLYGON((-70 40, -70 50, -55 50, -55 40, -70 40))"
blue_space = occurrences.search(
scientificname="Prionace glauca", nodeid=OTN, geometry=wkt
).execute()
# Dataset filter (replace with an actual dataset UUID)
# blue_ds = occurrences.search(
# scientificname="Prionace glauca", nodeid=OTN, datasetid="DATASET-UUID"
# ).execute()
# Slim down fields
blue_lean = occurrences.search(
scientificname="Prionace glauca", nodeid=OTN,
fields="scientificName,eventDate,decimalLatitude,decimalLongitude,datasetName"
).execute()
Tips
- Always drop NAs in latitude/longitude before mapping.
- Dates come back as text — convert to
Date
(R) ordatetime
(Python) if you need time plots. - For large datasets, paginate with
size
+from
(Python) or repeat calls withfrom
(R).
Follow-Along: “Where are the Blue Sharks?”
We’ll walk through a full workflow:
- pull blue shark (Prionace glauca) records contributed by OTN,
- focus on the NW Atlantic,
- make an interactive map and a couple of time-series views.
Install once
pip install pyobis pandas folium matplotlib
1) Imports & Setup
Load the libraries, set the OTN node UUID, species, and a bounding box (WKT polygon) for the NW Atlantic.
from pyobis import occurrences, dataset
import pandas as pd
import folium
from folium.plugins import HeatMap, MarkerCluster
import matplotlib.pyplot as plt
OTN = "68f83ea7-69a7-44fd-be77-3c3afd6f3cf8"
SPECIES = "Prionace glauca" # blue shark
WKT = "POLYGON((-80 30, -80 52, -35 52, -35 30, -80 30))"
2) Peek at OTN datasets
This shows which datasets under OTN actually contain blue shark records.
ds = dataset.search(nodeid=OTN, limit=100, offset=0).execute()
pd.DataFrame(ds["results"])[["id","title"]].head(10)
3) Query OBIS for Blue Shark
We add filters: species + node + region + time window.
df = occurrences.search(
scientificname=SPECIES,
nodeid=OTN,
geometry=WKT,
startdate="2000-01-01",
size=5000,
fields="id,scientificName,eventDate,decimalLatitude,decimalLongitude,datasetName"
).execute()
4) Clean the results
Drop rows without coordinates, parse event dates, and check the shape.
df = df.dropna(subset=["decimalLatitude","decimalLongitude"]).copy()
df["eventDate"] = pd.to_datetime(df["eventDate"], errors="coerce")
df = df.dropna(subset=["eventDate"])
print(df.shape)
df.head()
5) Quick sanity checks
See which datasets dominate and what the date range looks like.
print("Top datasets:\n", df["datasetName"].value_counts().head(10))
print("Date range:", df["eventDate"].min(), "→", df["eventDate"].max())
6) Interactive Map
Plot both individual points (sampled so the map stays fast) and a density heatmap.
center = [df["decimalLatitude"].median(), df["decimalLongitude"].median()]
m = folium.Map(location=center, zoom_start=4, tiles="OpenStreetMap")
# Markers
sample = df.sample(min(len(df), 2000), random_state=42)
mc = MarkerCluster(name="Blue shark points").add_to(m)
for _, r in sample.iterrows():
tip = f"{r['scientificName']} • {str(r['eventDate'])[:10]}"
folium.CircleMarker([r["decimalLatitude"], r["decimalLongitude"]],
radius=3, tooltip=tip).add_to(mc)
# Heatmap
HeatMap(df[["decimalLatitude","decimalLongitude"]].values.tolist(),
name="Density", radius=14, blur=22, min_opacity=0.25).add_to(m)
folium.LayerControl().add_to(m)
m.save("otn_blue_shark_map.html")
print("Saved → otn_blue_shark_map.html")
7) Yearly & Monthly Patterns
Summarize how records are distributed in time.
# Records per year
df["eventDate"].dt.year.value_counts().sort_index().plot(kind="bar", title="Records per year")
plt.xlabel("Year"); plt.ylabel("Records"); plt.show()
# Records by month
df["eventDate"].dt.month.value_counts().sort_index().plot(kind="bar", title="Records by month")
plt.xlabel("Month"); plt.ylabel("Records"); plt.show()
Common pitfalls & quick fixes
- Empty results? Loosen filters (remove
geometry
, broaden dates, dropdatasetid
). - Slow/large queries? Use
fields=...
, smaller regions, and paginate withsize
+from
. - Missing coordinates? Drop NA lat/lon before mapping.
- CRS confusion? OBIS returns WGS84 (EPSG:4326); mapping expects lon/lat.
Exercises
- Query a different species (e.g.,
"Gadus morhua"
) restricted to OTN. - Find a specific OTN dataset and filter occurrences by
datasetid
. - Compare recent vs historical records with
startdate
/enddate
. - Change the
geometry
to your own study region and map results. -
Save results:
- Python:
df.to_csv("export.csv", index=False)
- R:
write.csv(salmon, "export.csv", row.names = FALSE)
- Python:
Key Points
OBIS hosts species occurrence records.
OTN is an OBIS node; UUID: 68f83ea7-69a7-44fd-be77-3c3afd6f3cf8.
Use robis (R) or pyobis (Python) for programmatic queries.
Keep queries lean with fields, geometry, time window, and paging.