Lesson 4. Understand EPSG, WKT and Other CRS Definition Styles
Learning Objectives
- Identify the proj4 vs EPSG vs WKT CRS format when presented with all three formats.
- Look up a CRS definition in proj4, EPSG or WKT formats using spatialreference.org.
On the previous pages, you learned what a coordinate reference system (CRS) is, the components of a coordinate reference system and the general differences between projected and geographic coordinate reference systems. On this page, you will cover the different ways that CRS information is stored.
Coordinate Reference System Formats
There are numerous formats that are used to document a CRS. Three common formats include:
- PROJ.4
- EPSG
- Well-known Text (WKT)
Often you have CRS information in one format and you need to translate that CRS into a different format to use in a tool like Python. Thus it is good to be familiar with some of the key formats that you are likely to encounter.
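As a rough illustration of translating between formats, the sketch below uses the pyproj library (the library GeoPandas relies on for CRS handling) to start from an EPSG code and write the same CRS out as WKT and as a PROJ.4 string. The variable names are chosen only for this example, and the warning filter is there because pyproj warns that PROJ.4 strings can lose information compared with WKT.

```python
import warnings

from pyproj import CRS

# Start from an EPSG code (4326 is WGS84 geographic)
wgs84 = CRS.from_epsg(4326)

# Export the same CRS as a WKT string
wkt_string = wgs84.to_wkt()

# Export as a PROJ.4 string; pyproj warns that this format can drop detail
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    proj4_string = wgs84.to_proj4()

# Read the WKT back in and recover the EPSG code
round_trip = CRS.from_wkt(wkt_string)

print(proj4_string)
print(round_trip.to_epsg())
```

Going from WKT back to an EPSG code works here because the WKT that pyproj writes carries the EPSG identifier; WKT from other sources may not resolve to a code.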
One of the most powerful websites to look up CRS strings is Spatialreference.org. You can use the search on the site to find an EPSG code. Once you find the page associated with your CRS of interest you can then look at all of the various formats associated with that CRS: EPSG 4326 - WGS84 geographic
PROJ or PROJ.4 strings
PROJ.4 strings are a compact way to identify a spatial or coordinate reference system. PROJ.4 strings are one of the formats that GeoPandas can accept. However, note that many libraries are moving towards the more concise EPSG format.
Using the PROJ.4 syntax, you specify the complete set of parameters including the ellipse, datum, projection units and projection definition that define a particular CRS.
Break down the proj.4 format
Below is an example of a proj.4 string:
+proj=utm +zone=11 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
Notice that the CRS information is structured using a string of characters and numbers that are combined using + signs. The CRS for your data is in the proj4 format. The string contains all of the individual CRS elements that Python or another GIS might need. Each element is specified with a + sign, similar to how a .csv file is delimited or broken up by a comma (,). After each + you see the CRS element being defined, for example +proj= and +datum=.
You can break down the proj4 string into its individual components (again, separated by + signs) as follows:
- +proj=utm: the projection is UTM. UTM has several zones.
- +zone=11: the zone is 11, which is a zone on the west coast, USA.
- +datum=WGS84: the datum is WGS84 (the datum refers to the 0,0 reference for the coordinate system used in the projection)
- +units=m: the units for the coordinates are in meters.
- +ellps=WGS84: the ellipsoid (how the earth’s roundness is calculated) for the data is WGS84
Note that the zone is unique to the UTM projection. Not all CRSs will have a zone.
Also note that while California is above the equator, in the northern hemisphere, there is no N (specifying north) following the zone (i.e. 11N). South is explicitly specified in the UTM proj4 specification (with the +south parameter), so if there is no S, you can assume it’s a northern projection.
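The breakdown above can be sketched in a few lines of plain Python. The helper parse_proj4 below is a hypothetical function written only for this illustration (it is not part of any library); it splits a proj4 string on whitespace and turns each +key=value element into a dictionary entry, storing flags without a value, such as +no_defs, as True.

```python
def parse_proj4(proj_string):
    """Split a proj4 string into a dictionary of parameter names and values."""
    params = {}
    for token in proj_string.split():
        # Every CRS element starts with a + sign; skip anything else
        if not token.startswith("+"):
            continue
        key, _, value = token[1:].partition("=")
        # Flags such as +no_defs carry no value, so store True for them
        params[key] = value if value else True
    return params


utm_string = ("+proj=utm +zone=11 +datum=WGS84 +units=m "
              "+no_defs +ellps=WGS84 +towgs84=0,0,0")
utm_params = parse_proj4(utm_string)

# Look up individual CRS elements by name
print(utm_params["proj"], utm_params["zone"], utm_params["units"])

# A UTM string is northern unless the +south flag is present
hemisphere = "south" if "south" in utm_params else "north"
print(hemisphere)
```

This kind of parsing is only useful for inspecting a string by eye; to actually use a CRS, pass the string to a library such as GeoPandas or pyproj.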
Geographic (lat / long) Proj.4 String
Next, look at another CRS definition.
+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
This is a lat/long or geographic CRS. The components of the proj4 string are broken down below.
- +proj=longlat: the data are in a geographic (latitude and longitude) coordinate system
- +datum=WGS84: the datum is WGS84 (the datum refers to the 0,0 reference for the coordinate system used in the projection)
- +ellps=WGS84: the ellipsoid (how the earth’s roundness is calculated) is WGS84
Note that there are no specified units above. This is because this geographic coordinate reference system is in latitude and longitude which is most often recorded in Decimal Degrees.
Data Tip: the last portion of each proj4 string is +towgs84=0,0,0. This is a conversion factor that is used if a datum conversion is required.
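To see the difference between the two kinds of strings in code, the sketch below loads simplified versions of both proj4 strings with pyproj and asks each CRS whether it is projected or geographic and what units its first axis uses. The strings here deliberately leave out +ellps and +towgs84 to keep the example small; that simplification is an assumption of this sketch, not a recommendation.

```python
from pyproj import CRS

# Simplified versions of the two proj4 strings discussed above
utm_crs = CRS.from_proj4("+proj=utm +zone=11 +datum=WGS84 +units=m +no_defs")
geo_crs = CRS.from_proj4("+proj=longlat +datum=WGS84 +no_defs")

for label, crs in [("UTM zone 11", utm_crs), ("lat/long", geo_crs)]:
    # axis_info lists each coordinate axis along with its unit name
    first_axis = crs.axis_info[0]
    print(label,
          "| projected:", crs.is_projected,
          "| geographic:", crs.is_geographic,
          "| units:", first_axis.unit_name)
```

The geographic CRS reports its units even though the proj4 string does not list any, because degrees are implied by +proj=longlat.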
EPSG codes
EPSG codes are 4-5 digit numbers that each represent a particular CRS definition. The acronym EPSG comes from the now-defunct European Petroleum Survey Group.
Explore EPSG codes on spatialreference.org.
Import the worldBoundary layer that you’ve been working with in this module to explore the CRS.
import os
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
import earthpy as et
# Set working dir & get data
data = et.data.get_data('spatial-vector-lidar')
os.chdir(os.path.join(et.io.HOME, 'earth-analytics'))
# Import world boundary shapefile
worldBound_path = os.path.join("data", "spatial-vector-lidar", "global",
                               "ne_110m_land", "ne_110m_land.shp")
worldBound = gpd.read_file(worldBound_path)
worldBound.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
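Notice that the first line of the CRS returned above consists of two parts:
- the authority, EPSG, which tells you that the CRS definition comes from the EPSG registry, and
- the EPSG code itself: 4326
Older versions of GeoPandas returned this same information as a dictionary, {'init': 'epsg:4326'}, where ‘init’ told Python that a CRS definition (i.e. an EPSG code) would be provided.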
How to Create a CRS Object in Python
You often need to define the CRS for a spatial object. For example in the previous lessons, you created new spatial point layers, and had to define the CRS that the point x,y locations were in.
To do this you completed the following steps:
- You manually created an array for a single point (x,y).
- You turned that x,y point into a shapely Point object.
- Finally, you converted that Point object to a GeoPandas GeoDataFrame.
# Create a numpy array with x,y location of Boulder
boulder_xy = np.array([[476911.31, 4429455.35]])
# Create shapely point object
boulder_xy_pt = [Point(xy) for xy in boulder_xy]
# Convert to spatial dataframe - geodataframe -- assign the CRS using epsg code
# ('epsg:2957' replaces the deprecated {'init': 'epsg:2957'} dictionary syntax)
boulder_loc = gpd.GeoDataFrame(geometry=boulder_xy_pt,
                               crs='epsg:2957')
# View crs of new spatial points object
boulder_loc.crs
<Projected CRS: EPSG:2957>
Name: NAD83(CSRS) / UTM zone 13N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Canada - 108°W to 102°W
- bounds: (-108.0, 48.99, -102.0, 84.0)
Coordinate Operation:
- name: UTM zone 13N
- method: Transverse Mercator
Datum: NAD83 Canadian Spatial Reference System
- Ellipsoid: GRS 1980
- Prime Meridian: Greenwich
WKT or Well-known Text
It’s useful to recognize this format given that many tools, including ESRI’s ArcMap and ENVI, use it. Well-known Text (WKT) is a compact machine- and human-readable representation of geometric objects. It defines elements of coordinate reference system (CRS) definitions using a combination of brackets [] and elements separated by commas (,).
Here is an example of WKT for WGS84 geographic:
GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]
Notice here that the elements are described explicitly using all caps - for example:
- UNIT
- DATUM
Sometimes WKT-structured CRS information is embedded in a metadata file, similar to the structure seen below:
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.01745329251994328,
        AUTHORITY["EPSG","9122"]],
    AUTHORITY["EPSG","4326"]]
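One way to get a feel for this structure is to pull the pieces out with Python's built-in re module. The sketch below is only an illustration using simple regular expressions, not a full WKT parser: it lists the all-caps element names and the AUTHORITY entries found in the WKT above. Because the outermost element closes last, the final AUTHORITY entry is the one that identifies the CRS as a whole.

```python
import re

# The same WKT as above, joined into a single string
wkt = (
    'GEOGCS["WGS 84",'
    'DATUM["WGS_1984",'
    'SPHEROID["WGS 84",6378137,298.257223563,'
    'AUTHORITY["EPSG","7030"]],'
    'AUTHORITY["EPSG","6326"]],'
    'PRIMEM["Greenwich",0,'
    'AUTHORITY["EPSG","8901"]],'
    'UNIT["degree",0.01745329251994328,'
    'AUTHORITY["EPSG","9122"]],'
    'AUTHORITY["EPSG","4326"]]'
)

# Element names are the all-caps words that directly precede a [
keywords = list(dict.fromkeys(re.findall(r'([A-Z_]+)\[', wkt)))
print(keywords)

# Each AUTHORITY element holds an authority name and a code
authorities = re.findall(r'AUTHORITY\["([^"]+)","([^"]+)"\]', wkt)

# The last AUTHORITY belongs to the outermost element, the CRS itself
crs_authority, crs_code = authorities[-1]
print(crs_authority, crs_code)
```

For real work, a library such as pyproj can read WKT directly; the regular expressions here are just a way to see how the text is organized.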
How to Look Up a CRS
One of the most useful websites for looking up CRS information is spatialreference.org. This website has a search function that allows you to search for strings such as UTM 11N or WGS84.
Once you find the CRS that you are looking for, you can explore definitions of the CRS using various formats including proj4, epsg, WKT and others.
Additional Resources
Practice Your GeoPandas Dataframes Skills: Import Line & Polygon Shapefiles
Import the data/week5/california/madera-county-roads/tl_2013_06039_roads and data/week5/california/SJER/vector_data/SJER_crop.shp shapefiles into Python.
Call the roads object sjer_roads and the crop layer sjer_crop_extent.
Answer the following questions:
- What type of Python spatial object is created when you import each layer?
- What is the CRS and extent for each object?
- Do the files contain points, lines or polygons?
- How many spatial objects are in each file?