Using Python to Analyze Spatial Data Juan Carlos Méndez CTO @gkudos juan@gkudos.com
Topics ● Spatial Data? ● Python? ● Visualizing Geodata ● Analyzing Geodata (Basics)
About me
● Juan Carlos Méndez
  ○ CTO at @gkudos
  ○ GIS Consultant
  ○ Software Architect / Programmer / “Data Engineer”
  ○ https://github.com/dersteppenwolf
● Education
  ○ Systems Engineer - Universidad Nacional de Colombia, Bogotá
  ○ Telematics / eBusiness specialist - Escuela Colombiana de Ingeniería
  ○ Information Engineering (Student) - Universidad de los Andes
https://github.com/dersteppenwolf/pycon
Requirements
● QGIS ( http://www.qgis.org/es/site/ )
pip install -r requirements.txt
brew install mapnik
brew install spatialindex
Spatial Data?
Location, location, location! “You can buy the right home in the wrong location. You can change the structure, remodel it or alter the home's layout but, ordinarily, you cannot move it. It's attached to the land” http://bit.ly/2fz2ySD
Spatio-temporal data “...in Google, about 25 PB of data is being generated per day, and a significant portion of the data falls into the realm of spatio-temporal data...” Lee, J.-G., & Kang, M. (2015). Geospatial Big Data: Challenges and Opportunities. Big Data Research, 2(2), 74–81. http://doi.org/10.1016/j.bdr.2015.01.003
What for?
15 Global challenges facing humanity http://www.millennium-project.org/ millennium/challeng.html
How?
But... What is Spatial Data?
Spatial Data
● Located on the surface of the earth
● Coordinate Systems
GIS (Geographic Information Systems)
● Body of Knowledge
● Tools
● Science
Spatial is special? Yes:
● Multidimensional
● Voluminous
● Special methods for analysis
● Updating: Slow, complex and expensive
“Everything is related to everything else, but near things are more related than distant things” Tobler, W. 1970. A
Spatial is special? No: ● “spatial is not special, it’s just another column in the database...” Michael Terner http://bit.ly/2jZAWav
Spatial Data: Vector vs Raster
Vector http://gisgeography.com/spatial-data-types-vector-raster/
Spatial Data: Vector vs Raster
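A minimal sketch of the difference between the two models, using only the standard library. The sample points, the grid origin and the cell size are all hypothetical: the vector side stores explicit coordinates, while the raster side stores one value per grid cell (here, a count of points per cell).

```python
# Vector model: features are geometries described by explicit coordinates.
# (Hypothetical sample points; x/y in arbitrary map units.)
points = [(0.5, 0.5), (2.2, 1.7), (2.9, 1.1), (3.5, 2.5)]


def rasterize_counts(points, x_min, y_min, cell_size, n_cols, n_rows):
    """Raster model: a grid of equally sized cells, each holding one value.

    Each cell stores how many points fall inside it. Row 0 is the bottom
    row (starting at y_min); this is a simplification, since many raster
    formats store the top row first.
    """
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = int((x - x_min) // cell_size)
        row = int((y - y_min) // cell_size)
        # Points outside the grid extent are silently ignored.
        if 0 <= col < n_cols and 0 <= row < n_rows:
            grid[row][col] += 1
    return grid


grid = rasterize_counts(points, x_min=0.0, y_min=0.0,
                        cell_size=1.0, n_cols=4, n_rows=3)
```

The vector list keeps exact positions; the grid trades that precision for a regular structure that is easy to combine with other grids.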
Map projection “...formal process which converts features between a spherical or ellipsoidal surface and a projection surface, which is often flat…” http://bit.ly/2bA9Szk
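As one concrete case of such a conversion, here is a sketch of the spherical Web Mercator projection (EPSG:3857) written with the standard library only. The choice of projection is an assumption made for illustration; the deck does not name one, and for real work a library such as pyproj would normally handle this.

```python
import math

# WGS84 semi-major axis in metres, used as the sphere radius by Web Mercator.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


def web_mercator_to_lonlat(x, y):
    """Inverse projection: Web Mercator metres back to degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


# Approximate coordinates for Bogotá, used only as sample input.
x, y = lonlat_to_web_mercator(-74.07, 4.71)
lon, lat = web_mercator_to_lonlat(x, y)
```

The forward formula is undefined at the poles, which is why Web Mercator maps are usually cut off at roughly 85 degrees of latitude.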
Spatial Data - File Formats
...lots… (open and proprietary)...
● Shapefile
● GeoJSON
● KML
● File Geodatabase
● GeoTIFF
● GeoPackage
● SpatiaLite
● …
Spatial is special? ...It can be complex... …but with Python it becomes easier and more fun…
Python?
Python for Geospatial Data
Lots of…
● Tools / Libraries
● Proprietary / Open
● Desktop / Server
● Analysis / Visualization / ETL
Python for Geospatial Data
ESRI
● ArcPy
  ○ Desktop: Automation / Customization. E.g. CartoDB Toolbox : Import Data From to Carto
  ○ Server: Geoprocessing as a “Web Service”
● ArcGIS Web: ArcGIS API for Python (2017)
QGIS
● Open source desktop GIS tool written in C++, Python, Qt
● QGIS API. E.g. CartoDB Plugin for QGIS
● PyQGIS: Scripting using Python
● Server: QGIS Server Python plugins
Python for Geospatial Data
● CKAN - web-based open source management system for the storage and distribution of open data (including geospatial data).
● GeoDjango - storing and manipulating geographic data using the Django ORM
● GeoNode - web-based application and platform for developing geospatial information systems (GIS) and for deploying spatial data infrastructures (SDI)
Python for Geospatial Data
● pyshp - For reading and writing shapefiles (in pure Python)
● pyproj - For conversions between projections
● shapely - For geometry handling
● fiona - For making it easy to read/write geospatial data formats
● ogr/gdal - For reading, writing, and transforming geospatial data formats *
● Rasterio - reads and writes geospatial raster datasets
Show me the code
Describe Data
data/world_borders
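A rough idea of what "describing" a vector dataset involves: feature count, geometry types, attribute fields and bounding box. This sketch assumes a small, hypothetical in-memory collection shaped like a GeoJSON FeatureCollection rather than the world_borders file itself, and it uses only the standard library instead of a format-reading library.

```python
from collections import Counter

# Hypothetical stand-in for a vector dataset (GeoJSON-like structure).
collection = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "A"},
         "geometry": {"type": "Point", "coordinates": [-74.0, 4.6]}},
        {"type": "Feature", "properties": {"name": "B"},
         "geometry": {"type": "LineString",
                      "coordinates": [[-75.5, 6.2], [-76.5, 3.4]]}},
        {"type": "Feature", "properties": {"name": "C"},
         "geometry": {"type": "Point", "coordinates": [-73.1, 7.1]}},
    ],
}


def iter_positions(coords):
    """Yield (x, y) pairs from arbitrarily nested coordinate arrays."""
    if isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_positions(part)


def describe(collection):
    """Summarise a feature collection: count, types, fields, bounds."""
    features = collection["features"]
    xs, ys = [], []
    for feature in features:
        for x, y in iter_positions(feature["geometry"]["coordinates"]):
            xs.append(x)
            ys.append(y)
    return {
        "count": len(features),
        "geometry_types": dict(Counter(f["geometry"]["type"] for f in features)),
        "fields": sorted({key for f in features for key in f["properties"]}),
        "bounds": (min(xs), min(ys), max(xs), max(ys)),
    }


summary = describe(collection)
```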
Fiona
● https://github.com/Toblerity/Fiona
● Fiona is OGR's neat and nimble API for Python programmers.
● Fiona does reading and writing data formats. For this it uses OGR, the most popular open-source conversion system.
● The OGR Simple Features Library is a C++ open source library providing read (and sometimes write) access to a variety of vector file formats including ESRI Shapefiles, S-57, SDTS, PostGIS, Oracle Spatial, and Mapinfo mid/mif and TAB formats.
Data Format Conversion
From CSV to SHP
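The general shape of such a conversion can be sketched with the standard library alone. Note the substitution: writing a shapefile requires a library such as pyshp or Fiona, so this example targets GeoJSON instead. The CSV content and the column names `name`, `lon` and `lat` are assumptions for illustration.

```python
import csv
import io
import json

# Hypothetical CSV content; the column names are assumptions.
CSV_TEXT = """name,lon,lat
Site A,-74.07,4.60
Site B,-74.05,4.65
"""


def csv_to_geojson(csv_file, lon_field="lon", lat_field="lat"):
    """Turn each CSV row into a GeoJSON point feature.

    The coordinate columns become the geometry; every other column is
    kept as a string property.
    """
    features = []
    for row in csv.DictReader(csv_file):
        lon = float(row.pop(lon_field))
        lat = float(row.pop(lat_field))
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": dict(row),
        })
    return {"type": "FeatureCollection", "features": features}


geojson = csv_to_geojson(io.StringIO(CSV_TEXT))
geojson_text = json.dumps(geojson, indent=2)  # ready to write to a .geojson file
```

GeoJSON stores coordinates in longitude, latitude order, which is why `lon` comes first in the geometry.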
Shapely ● https://github.com/Toblerity/Shapely ● Manipulation and analysis of geometric objects in the Cartesian plane ● With Shapely, you can do things like buffers, unions, intersections, centroids, convex hulls, and lots more.
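To give a feel for the planar geometry behind operations such as area and centroid, here is a pure-Python sketch using the shoelace formula. This is not Shapely's API or implementation, only an illustration of the kind of computation a geometry library performs; the function name and the sample polygon are hypothetical.

```python
def polygon_area_and_centroid(ring):
    """Return (area, centroid) of a simple polygon in the Cartesian plane.

    ring: list of (x, y) vertices, without repeating the first vertex at
    the end. Raises ZeroDivisionError for degenerate (zero-area) input.
    """
    twice_area = 0.0
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0  # signed contribution of this edge
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    signed_area = twice_area / 2.0
    centroid = (cx / (6.0 * signed_area), cy / (6.0 * signed_area))
    return abs(signed_area), centroid


# Hypothetical 4 x 2 rectangle.
rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]
area, centroid = polygon_area_and_centroid(rectangle)
```

Because the computation is planar, it gives meaningful areas only for projected coordinates, not for raw longitude/latitude degrees.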
Geocoding
What is geocoding? ● Geocoding is the process of transforming a description of a location—such as a pair of coordinates, an address, or a name of a place—to a location on the earth's surface. http://arcg.is/2kUedk7
GeoPy
● https://github.com/geopy/geopy
● geopy makes it easy for Python developers to locate the coordinates of addresses, cities, countries, and landmarks across the globe using third-party geocoders and other data sources
● geopy includes geocoder classes for the OpenStreetMap Nominatim, ESRI ArcGIS, Google Geocoding API (V3), Baidu Maps, Bing Maps API, Mapzen Search, Yandex, IGN France, GeoNames, NaviData, OpenMapQuest, What3Words, OpenCage, SmartyStreets, geocoder.us, and GeocodeFarm geocoder services.
Reverse Geocoding
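Conceptually, geocoding maps a description to coordinates and reverse geocoding maps coordinates back to the closest known description. The toy sketch below works offline against a tiny hypothetical gazetteer with approximate city coordinates; real geocoders such as those wrapped by geopy query remote services and are far more sophisticated.

```python
import math

# Tiny hypothetical gazetteer; values are approximate (lat, lon) pairs.
GAZETTEER = {
    "Bogotá": (4.71, -74.07),
    "Medellín": (6.24, -75.58),
    "Cali": (3.45, -76.53),
}


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points on a spherical earth."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def geocode(name):
    """Place name -> (lat, lon), or None if the name is unknown."""
    return GAZETTEER.get(name)


def reverse_geocode(lat, lon):
    """(lat, lon) -> name of the nearest gazetteer entry."""
    return min(
        GAZETTEER,
        key=lambda name: haversine_km(lat, lon, *GAZETTEER[name]),
    )


nearest = reverse_geocode(4.60, -74.08)
```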
Visualize?
Mapnik ● http://mapnik.org/ ● the core of geospatial visualization & processing ● mapnik combines pixel-perfect image output with lightning-fast cartographic algorithms, and exposes interfaces in C++, Python, and Node.
Mapnik
● Installing Mapnik on OS X with Homebrew https://github.com/mapnik/mapnik/wiki/MacInstallation_Homebrew
  ○ brew install mapnik
● Python bindings for mapnik https://github.com/mapnik/python-mapnik
● Stacks built with mapnik: OpenStreetMap, Mapbox, CartoDB, Stamen, MapQuest, Kosmtik
Mapnik Basics
Mapnik - XML
Mapnik Complex Styles
Mapnik Markers
Mapnik Composite Compositing operations affect the way colors and textures of different elements and styles interact with each other. Two main categories: color and alpha E.g. Multiply literally multiplies the color of the top layer by the color of each layer beneath, which usually means overlapping areas become darker.
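The multiply operation described above can be sketched for opaque 8-bit RGB colours as follows. This is only the textbook per-channel formula; it ignores alpha, and Mapnik's own implementation may round differently. The function name is hypothetical.

```python
def multiply_blend(top, bottom):
    """Multiply two opaque 8-bit RGB colours channel by channel.

    Each channel is treated as a fraction of 255, so the result can never
    be lighter than either input.
    """
    return tuple((t * b) // 255 for t, b in zip(top, bottom))


white = (255, 255, 255)
grey = (128, 128, 128)
orange = (200, 100, 50)

unchanged = multiply_blend(white, orange)  # white leaves the colour as is
darker = multiply_blend(grey, orange)      # mid-grey roughly halves it
```

Multiplying by white is the identity and multiplying by black always gives black, which is why overlapping features rendered with multiply tend to darken.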
What is Spatial Analysis?
Spatial Analysis
Geospatial data is more than maps!
What is geoprocessing?
● Geoprocessing is any GIS operation used to manipulate data.
● A typical geoprocessing operation takes an input dataset, performs an operation on that dataset, and returns the result of the operation as an output dataset, also referred to as derived data.
● Common geoprocessing operations: geographic feature overlay, feature selection and analysis, topology processing, and data conversion.
● Geoprocessing allows you to define, manage, and analyze geographic information used to make decisions.
● http://bit.ly/2k8l3P8
Spatial Analysis ● Spatial analysis includes any of the formal techniques which study entities using their topological, geometric, or geographic properties.
Spatial Analysis
Jupyter ● http://jupyter.org/ ● web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text ● Uses include: data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more
Geopandas ● https://github.com/geopandas/geopandas ● pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. ● GeoPandas is a project to add support for geographic data to pandas objects.
PySAL
● https://github.com/pysal/pysal
● PySAL: Python Spatial Analysis Library
● Sub-packages
  ○ pysal.cg - Computational Geometry
  ○ pysal.core - Core Data Structures and IO
  ○ pysal.esda - Exploratory Spatial Data Analysis
  ○ pysal.inequality - Spatial Inequality Analysis
  ○ pysal.region - Spatially Constrained Clustering
  ○ pysal.spatial_dynamics - Spatial Dynamics
  ○ pysal.spreg - Regression and Diagnostics
  ○ pysal.weights - Spatial Weights
  ○ pysal.network - Network Constrained Analysis
  ○ pysal.contrib - Contributed Modules
Geoprocessing
Geoprocessing ● Spatial Joins ● Data Clipping ● Data Enrichment ● Data filtering ● Spatial Analysis
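A spatial join between points and polygons can be sketched in pure Python with the ray-casting point-in-polygon test. The zones and sites below are hypothetical stand-ins for real polygon and point layers, and the sketch ignores holes, multipolygons and points lying exactly on a boundary; in practice a library such as GeoPandas would handle these cases and use a spatial index.

```python
def point_in_polygon(x, y, ring):
    """Ray casting: toggle 'inside' each time a ray going right crosses an edge."""
    inside = False
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical polygon layer (zone name -> ring of vertices).
zones = {
    "North": [(0, 5), (10, 5), (10, 10), (0, 10)],
    "South": [(0, 0), (10, 0), (10, 5), (0, 5)],
}

# Hypothetical point layer (site name -> coordinates).
sites = {"Museum": (2, 7), "Park": (8, 1), "Airport": (15, 3)}


def spatial_join(sites, zones):
    """Attach to each site the name of the first zone containing it, or None."""
    joined = {}
    for site, (x, y) in sites.items():
        joined[site] = next(
            (name for name, ring in zones.items()
             if point_in_polygon(x, y, ring)),
            None,
        )
    return joined


joined = spatial_join(sites, zones)
# Filtering on the join result acts as a simple clip to the zone layer.
inside_any_zone = {site for site, zone in joined.items() if zone is not None}
```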
data/ideca/Loca.shp - Localidades (localities), Bogotá
data/ideca/sitios_interes.shp - Sitios de Interés (points of interest), Bogotá
data/ideca/clasificacion_sitios_interes.csv
juan@gkudos.com http://gkudos.com/ info@gkudos.com Thanks!!
