GIS for Business Analytics

Table of contents

Chapter 1: What is GIS

Chapter 2: Geocoding and Visualizing

Chapter 3: Market Areas

Chapter 4: Geocoding

Chapter 5: Regression Modelling

Chapter 6: Spatial Regressions

Chapter 7: Spatial Autocorrelation

Chapter 8: Spatial Modeling

Chapter 9: Spatial Decision-making

Geocoding and Visualizing

Vector and Raster – A brief overview

What makes spatial data special is its ability to represent real-world phenomena with both discrete and non-discrete boundaries. By discrete boundaries, we mean features with strict, well-defined geometric shapes, such as roads, buildings, bodies of water, and census tracts (or other administrative boundaries). Because the limits of these features are clearly defined, they are called discrete. For instance, a person is either in a building or out of it, in the water or out of the water, in Toronto or not in Toronto. This is very useful for understanding how the locations of things relate to one another, and it corresponds to what GIS calls the vector data model, or simply a vector (not to be confused with the mathematical concept of a vector).

Not all boundaries are clear and crisp, so we have a second data model at hand, which we call the raster. The raster data model best represents phenomena with non-discrete boundaries, such as changing levels of elevation, soil nutrient levels, and temperature. These phenomena do not have a fixed boundary; rather, they behave as a surface over which values vary. Because there are no firm lines of separation, different types of files are used to represent this kind of data in the GIS environment.

Vector data are usually stored as shapefiles, while raster data are stored as raster or grid files. By mathematical definition, the vector format allows a position and a direction to be described. In a GIS, the vector format is a graphical representation of a geographical phenomenon that avoids the generalisation that comes from assigning values to the cells of a matrix. Lines are represented as continuous strokes of simple geometric shapes and are not split into cells. One of the great advantages of the vector format is that, owing to this strict and simple geometric representation, the position of a feature can be recorded with high accuracy. This makes the vector format particularly useful for geocoding locations in geographical space, where a precise set of coordinates is used to find an accurate location.
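
To make the contrast between the two data models concrete, here is a minimal Python sketch. It is only an illustration: the building footprint, the elevation grid, the cell size, and the function names are all hypothetical and do not come from any dataset used in this chapter. The vector feature is a polygon with exact vertex coordinates, so a point is either inside it or not; the raster is a grid of cells, each holding one value of a continuous surface.

```python
# Vector model: a feature with a crisp boundary, stored as exact coordinates.
# Hypothetical rectangular "building" footprint as (x, y) vertices.
building = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray starting at (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Raster model: a grid of cells, each holding a value of a continuous surface.
# Hypothetical elevation values in metres, one per cell.
elevation = [
    [102.0, 104.5, 107.0],
    [101.0, 103.0, 106.5],
    [100.0, 101.5, 104.0],
]
cell_size = 10.0  # assumed cell width and height in map units


def raster_value(x, y, grid, size):
    """Return the value of the cell containing (x, y); origin at top-left."""
    col = int(x // size)
    row = int(y // size)
    return grid[row][col]


in_building = point_in_polygon(2.0, 1.5, building)   # point within the footprint
out_building = point_in_polygon(5.0, 1.5, building)  # point beyond the east wall
height = raster_value(12.0, 25.0, elevation, cell_size)
```

The vector test answers a yes-or-no question about a boundary, whereas the raster lookup returns whatever value the surface takes in the cell that contains the point, generalised to the cell size.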

Adding and geocoding your first data

To geocode addresses, there must be a reference database of structured information in which each street segment is associated with the street numbers it contains. Geocoding is carried out against this reference database. With it, one can obtain the location of a given address; nowadays the database is often made available as an online service (e.g. Google Maps). The process works by finding, in the geocoding database, the street segment that matches the street name and street number of the address. The result is a point on the map with a set of geographic coordinates (latitude and longitude).
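
The lookup described above can be sketched in a few lines of Python. This is a simplified illustration, not the method used by Google Maps, OpenStreetMap, or MMQGIS: it assumes each segment stores a range of street numbers and that an address is placed by linear interpolation along the segment, which is one common approach. The `StreetSegment` class, the `geocode` function, the street name, and the coordinates are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class StreetSegment:
    """One record of a hypothetical reference database."""
    name: str
    from_number: int   # street number at the start of the segment
    to_number: int     # street number at the end of the segment
    start: tuple       # (latitude, longitude) of the segment start
    end: tuple         # (latitude, longitude) of the segment end


def geocode(street, number, segments):
    """Return (latitude, longitude) for an address, or None if no match."""
    for seg in segments:
        low, high = sorted((seg.from_number, seg.to_number))
        # Skip segments with a different name or a number range that
        # does not contain the requested street number.
        if seg.name.lower() != street.lower() or not low <= number <= high:
            continue
        span = seg.to_number - seg.from_number
        # Fraction of the way along the segment (assumed linear numbering).
        t = 0.0 if span == 0 else (number - seg.from_number) / span
        lat = seg.start[0] + t * (seg.end[0] - seg.start[0])
        lon = seg.start[1] + t * (seg.end[1] - seg.start[1])
        return (lat, lon)
    return None


# Made-up segment and coordinates, for illustration only.
reference = [
    StreetSegment("Example Street", 100, 200,
                  (43.6500, -79.3800), (43.6510, -79.3790)),
]

location = geocode("Example Street", 150, reference)  # halfway along the segment
missing = geocode("Example Street", 999, reference)   # no segment holds this number
```

A production geocoder also has to deal with abbreviations, misspellings, and odd and even numbers on opposite sides of the street, all of which this sketch ignores.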

  1. In QGIS, navigate to Plugins and choose Manage and Install Plugins. Here you can choose plugins that allow QGIS to perform additional tasks and execute different functions. Install the MMQGIS plugin.

    You should now be able to see MMQGIS in the menu toolbar.

  2. Download the Starbucks dataset, which includes a selection of Starbucks locations in Toronto, here. Create a directory called starbucks on your desktop and move the file to the newly created directory.

  3. Click on MMQGIS (located in the menu at the top) > Geocode > Geocode CSV with Google/OpenStreetMap. In your CSV, you’ll need a separate field for each piece of geographic information about each Starbucks location. In this dataset, these are Street 1, City, Country, and Postal Code. This operation may take some time. Please see video below for further details.

  4. The location of a business is of the utmost importance for making it more competitive and supporting its economic growth. Understanding geographical locations is complex, especially in the retail sector. Geocoding eases the task of understanding the locations of retail and business activity, and allows for better management of core locations. Geographic Information Systems (GIS) make it possible to generate maps, but above all they support location-based decisions that rely on regional and local intelligence. Geocoding is a cornerstone of this process. As we saw, we were able to import data into the GIS environment and integrate it in a meaningful way. We will later assess the dynamics of the location of these Starbucks stores. We can divide the importance of geocoding into three distinct areas, all of which support more efficient business decisions:

    • (i) spatial patterns: As we display addresses on a map as vector points, we can see how businesses change both over time, as more locations appear as points, and over space, as certain businesses are closer to each other, revealing spatially explicit patterns. This allows us to understand changes in the retail environment.
    • (ii) customer analytics: by geocoding the addresses of our customers, for instance, we can develop market strategies suited to our success. This is especially the case, as we will learn later on, when we relate different data sets of demographic information to customer and store locations.
    • (iii) surrounding infrastructure: finally, geocoding gives us the opportunity to understand the surrounding physical environment. For instance, are certain Starbucks located at the intersections of major roads? How do population density and the surrounding business environment relate to our store locations?
    These are the questions that can be answered by knowing the infrastructure surrounding our set of locations.
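
As a small illustration of the spatial patterns in point (i), the sketch below measures how close geocoded points are to each other. It is a hedged example rather than part of the QGIS workflow: the store labels and coordinates are invented, not taken from the Starbucks dataset, and the great-circle (haversine) distance on a spherical Earth of radius 6371 km is an assumption that is adequate at city scale.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0  # assumed mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_neighbour(name, points):
    """Return (other_name, distance_km) for the point closest to `name`."""
    lat, lon = points[name]
    others = (
        (other, haversine_km(lat, lon, o_lat, o_lon))
        for other, (o_lat, o_lon) in points.items()
        if other != name
    )
    return min(others, key=lambda pair: pair[1])


# Invented store coordinates (latitude, longitude), for illustration only.
stores = {
    "A": (43.6500, -79.3800),
    "B": (43.6510, -79.3810),
    "C": (43.7000, -79.4000),
}

closest_to_a, distance_a = nearest_neighbour("A", stores)
closest_to_c, distance_c = nearest_neighbour("C", stores)
```

Stores whose nearest neighbour is only a short distance away form a cluster, while a large nearest-neighbour distance marks an isolated location.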