GISP Knowledge – Public

Search notes for the Esri certification

  1. Conceptual foundations

    1. Knowledge of spatial relationships such as distance (e.g., horizontal and vertical), direction, and topology (e.g., adjacency, connectivity, and overlap) that are particularly relevant to geospatial data analysis 

this just has everything in great detail:


Equals: a = b

Topologically equal. Also (a ∩ b = a) ∧ (a ∩ b = b)


Disjoint: a ∩ b = ∅

a and b are disjoint, have no point in common. They form a set of disconnected geometries.


Intersects: a ∩ b ≠ ∅

a and b have at least one point in common (the negation of Disjoint).


Touches: (a ∩ b ≠ ∅) ∧ (a° ∩ b° = ∅), where a° denotes the interior of a

a touches b, they have at least one boundary point in common, but no interior points.


Contains: a ∩ b = b

Every point of b is also a point of a.


Covers: a° ∩ b = b

b lies in the interior of a (extends Contains). Other definitions: "no points of b lie in the exterior of a", or "every point of b is a point of (the interior of) a".


Covered By

Converse of Covers.


Within: a ∩ b = a

Converse of Contains: a lies entirely in b.


Crosses

a crosses b at some point


Overlaps

a and b have common interior points
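A quick sanity check of a few of these relations, sketched for axis-aligned rectangles only (my own toy helpers, not a GIS library API):

```python
# Minimal sketch: interval checks for a few topological relations
# between axis-aligned rectangles given as (xmin, ymin, xmax, ymax).

def disjoint(a, b):
    """True if a and b share no point at all."""
    return a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1]

def contains(a, b):
    """True if every point of b is a point of a (a ∩ b = b)."""
    return a[0] <= b[0] and a[1] <= b[1] and a[2] >= b[2] and a[3] >= b[3]

def touches(a, b):
    """True if a and b share boundary points but no interior points."""
    if disjoint(a, b):
        return False
    # Interiors are disjoint when the overlap collapses to a line or point.
    return (min(a[2], b[2]) == max(a[0], b[0]) or
            min(a[3], b[3]) == max(a[1], b[1]))

a = (0, 0, 4, 4)
b = (4, 0, 8, 4)   # shares the edge x = 4 with a
c = (1, 1, 2, 2)   # inside a

print(disjoint(a, b))  # False: they meet along an edge
print(touches(a, b))   # True: common boundary, no common interior
print(contains(a, c))  # True
```

Real GIS libraries implement the full set of relations over arbitrary geometries via the DE-9IM intersection matrix; the rectangle case above just makes the boundary/interior distinction concrete.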


Directional relations

Directional relations can again be differentiated into external directional relations and internal directional relations. An internal directional relation specifies where an object is located inside the reference object, while an external directional relation specifies where the object is located outside of the reference object.

  • Examples for internal directional relations: left; on the back; athwart, abaft

  • Examples for external directional relations: on the right of; behind; in front of, abeam, astern

Distance relations

Distance relations specify how far the object is away from the reference object.

  • Examples are: at; nearby; in the vicinity; far away

    Euclidean distance is calculated as:

    D = √[(x1 – x2)² + (y1 – y2)²]

    Where (x1,y1) is the coordinate for point A, (x2,y2) is the coordinate for point B, and D is the straight-line distance between points A and B.


    Manhattan distance is calculated as:

    D = |x1 – x2| + |y1 – y2|

    Where (x1, y1) is the coordinate for point A, (x2, y2) is the coordinate for point B, and D is the vertical plus horizontal difference between points A and B. It is the distance you must travel if you are restricted to north/south and east/west travel only. This method is generally more appropriate than Euclidean distance when travel is restricted to a street network in cases where actual street network travel costs are not available.
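Both formulas as a quick sketch, with points given as (x, y) tuples:

```python
import math

# Straight-line (Euclidean) vs. grid (Manhattan) distance between two points.

def euclidean(p, q):
    """sqrt((x1 - x2)^2 + (y1 - y2)^2)"""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def manhattan(p, q):
    """|x1 - x2| + |y1 - y2|"""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

a, b = (0, 0), (3, 4)
print(euclidean(a, b))  # 5.0
print(manhattan(a, b))  # 7
```

Note that Manhattan distance is always greater than or equal to Euclidean distance between the same two points, which is why it is often the better proxy for street-network travel.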


    1. Knowledge of standard spatial data models, including the nature of vector, raster, and object-oriented models, in the context of spatial data used in the workplace

    Spatial models (in some places called GIS models) might describe basic properties and processes for a set of spatial features (Bolstad 13 – you have heard about this)

    The aim is to study spatial objects or phenomena in the real world

    According to Bolstad

    Cartographic models: temporally static, combined spatial datasets, operations and functions for problem-solving

    Spatio-temporal models: dynamics in space and time, time-driven processes

    Network models: modeling of resources (flow, accumulation) as limited to networks

    Goodchild 2003

    Data models: Entities and fields as conceptual models

    Static modeling: taking inputs to transform them into outputs using sets of tools and functions

    Dynamic modeling: iterative, sets of initial conditions, apply transformations to obtain a series of predictions at time intervals

    DeMers 2005

    Based on purpose: descriptive – passive, description of the study area; prescriptive – active, imposing a best solution

    Based on methodology: stochastic – based on statistical probabilities; deterministic – based on known functional linkages and interactions

    Based on logic: inductive – general models based on individual data; deductive – from general to specific using known factors and relationships

    The aim of spatial modeling is to derive a meaningful representation of events, occurrences or processes by making use of the power of spatial analysis

    Vector data are composed of points, lines, polygons

    Points represent discrete locations on the ground

    Lines represent linear features, such as rivers, roads and transmission cables

    Arcs are composed of nodes and vertices. Arcs begin and end at nodes, and may have 0 or more vertices between the nodes. The vertices define the shape of the arc along its length. Arcs which connect to each other will share a common node.

    Polygons form bounded areas. In the point and line datasets shown above, the land masses, islands, and water features are represented as polygons. Polygons are formed by bounding arcs, which keep track of the location of each polygon

    ASCII Coordinate Data files may also be used in ArcGIS. Point layers can be created from files containing single records for individual points. 

    Raster datasets are composed of rectangular arrays of regularly spaced square grid cells. Each cell has a value, representing a property or attribute of interest. While any type of geographic data can be stored in raster format, raster datasets are especially suited to the representation of continuous, rather than discrete, data. Some examples of continuous data are:

    • oil depth across an open-water oil spill

    • soil pH

    • reflectance in a certain band in the electromagnetic spectrum

    • elevation

    • landform aspect (compass bearing of steepest downward descent)

    • salinity of a water body

    Pixel or cell? All raster datasets are stored in similar formats. You will want to know the difference between a pixel and a cell, even though they are functionally equivalent. A pixel (short for PICture ELement) represents the smallest resolvable "piece" of a scanned image, whereas a cell represents a user-defined area representing a phenomenon. A pixel is always a cell, but a cell is not always a pixel.
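A sketch of the cell arithmetic a regular grid implies, for a north-up raster (the origin coordinates and cell size below are made-up values):

```python
# Map a ground coordinate to the row/column of a north-up raster,
# given the raster's upper-left origin and cell size.

def cell_index(x, y, xmin, ymax, cellsize):
    """Return (row, col) of the cell containing point (x, y).
    Rows count downward from the top edge (ymax); columns count
    rightward from the left edge (xmin)."""
    col = int((x - xmin) // cellsize)
    row = int((ymax - y) // cellsize)
    return row, col

# A 10 m raster whose upper-left corner is at (500000, 4200000):
print(cell_index(500025, 4199985, 500000, 4200000, 10))  # (1, 2)
```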

    There are many types of raster data you may be familiar with:

    • grids (ArcGIS & ArcInfo specific)

    • graphical images (TIFF, JPEG, BMP, GIF, etc.)

    • USGS DEM (Digital Elevation Model)

    • remotely-sensed images (Landsat, SPOT, AVIRIS, AVHRR, Imagine IMG, digital orthophotos)

    A geodatabase is an object-oriented spatial model

    Geodatabase data model

    – Uses a relational database that stores geographic data: a type of database in which the data is organized across several tables. Tables are associated with each other through common fields, and data items can be recombined from different files

    – A container for storing spatial and attribute data and the relationships that exist among them

    – Features and their associated attributes can be structured to work together as an integrated system using rules, relationships, and topological associations

    Primary (basic) components 

    – feature classes, 

    – feature datasets, 

    – nonspatial tables. 

    complex components building on the basic components: 

    – topology, 

    – relationship classes, 

    – geometric networks 

    A feature class is a collection of geographic features of the same type; types include point, line, polygon, and annotation feature classes.

    Feature classes may exist independently in a geodatabase as stand-alone feature classes or you can group them into feature datasets 

    A feature dataset is composed of feature classes that have been grouped together so they can participate in topological relationships with each other. All the feature classes in a feature dataset must share the same spatial reference (or coordinate system) 

    Edits you make to one feature class may result in edits being made automatically to some or all of the other feature classes in the feature dataset 

    Feature class tables and nonspatial attribute tables. 

    Both types of tables are created and managed in ArcCatalog and edited in ArcMap. Both display in the traditional row-and-column format. The difference is that feature class tables have one or more columns that store feature geometry. 

    Nonspatial tables contain only attribute data (no feature geometry) and display in ArcCatalog with a table icon. They can exist in a geodatabase as stand-alone tables, or they can be related to other tables or feature classes

    In a geodatabase, you can model each of these real-world networks with a geometric network. Starting with simple point and line feature classes, you use ArcCatalog to create a geometric network that will enable you to answer questions such as: Which streams will be affected by a proposed dam? Which areas will be affected by a water main repair? What is the quickest route between two points in the network? 

    Relationship Classes – In a geodatabase, relationship classes provide a way to model real-world relationships that exist between objects such as parcels and buildings or streams and water sample data. By using relationship classes, you can make your GIS database more accurately reflect the real world and facilitate data maintenance.

    1-1 relationship – each object of the origin table/feature class can be related to zero or one object of the destination table/feature class

    1-Many relationship – each object in the origin table/feature class can be related to multiple objects in the destination table/feature class

    Many-Many relationship – multiple objects of the origin table/feature class can be related to multiple objects of the destination table/feature class
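The three cardinalities sketched with plain Python mappings (made-up parcel/building/owner data, not a geodatabase API):

```python
# 1-1: each parcel has at most one primary address
parcel_address = {"P1": "12 Elm St", "P2": "14 Elm St"}

# 1-M: one parcel can relate to many buildings
parcel_buildings = {"P1": ["B1", "B2"], "P2": ["B3"]}

# M-N: owners and parcels, held as a separate pair table
owner_parcel = [("Alice", "P1"), ("Alice", "P2"), ("Bob", "P2")]

# Navigating the M-N relationship from either side:
owners_of_p2 = [o for (o, p) in owner_parcel if p == "P2"]
print(owners_of_p2)  # ['Alice', 'Bob']
```

The separate pair table for the many-to-many case mirrors how a relational database (and a geodatabase) stores that cardinality: as an intermediate table keyed on both sides.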

    1. Understanding of the conceptual foundations on which geographic information systems (GIS) are based, including the problem of representing change over time and the imprecision and uncertainty that characterizes all geographic information 

    Like, what? Of course it’s difficult to spatially represent change over time. And yes, the world is constantly changing, the world is a globe, and GIS is 2D – skipped!

    1. Knowledge of earth geometry and its approximations, including geoids, ellipsoids, and spheres

    geoid is the shape that the surface of the oceans would take under the influence of Earth's gravitation and rotation alone, in the absence of other influences such as winds and tides

    In geodesy, a reference ellipsoid is a mathematically defined surface that approximates the geoid, the truer figure of the Earth, or other planetary body. Because of their relative simplicity, reference ellipsoids are used as a preferred surface on which geodetic network computations are performed and point coordinates such as latitude, longitude, and elevation are defined.

    In geometric geodesy, two standard problems exist:

    First (direct) geodetic problem

    Given a point (in terms of its coordinates) and the direction (azimuth) and distance from that point to a second point, determine (the coordinates of) that second point.

    Second (inverse) geodetic problem

    Given two points, determine the azimuth and length of the line (straight line, arc or geodesic) that connects them.

    In the case of plane geometry (valid for small areas on the Earth's surface) the solutions to both problems reduce to simple trigonometry. On the sphere, the solution is significantly more complex; e.g., in the inverse problem the azimuths will differ between the two end points of the connecting great-circle arc, i.e., the geodesic.

    On the ellipsoid of revolution, geodesics may be written in terms of elliptic integrals, which are usually evaluated in terms of a series expansion; for example, see Vincenty's formulae.

    In the general case, the solution is called the geodesic for the surface considered. The differential equations for the geodesic can be solved numerically.
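For the spherical case, the distance part of the inverse problem can be sketched with the haversine formula (a mean Earth radius is assumed here; the ellipsoidal solutions such as Vincenty's are more accurate):

```python
import math

# Inverse problem on a sphere: great-circle (geodesic) distance between
# two lat/lon points via the haversine formula. r is an assumed mean
# Earth radius in metres.

def haversine(lat1, lon1, lat2, lon2, r=6371000.0):
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2 +
         math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# One degree of latitude along a meridian is about 111 km:
print(round(haversine(0, 0, 1, 0) / 1000, 1))  # 111.2
```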

    The geoid surface is irregular, unlike the reference ellipsoid which is a mathematical idealized representation of the physical Earth, but considerably smoother than Earth's physical surface

    two main reference surfaces have been established to approximate the shape of the Earth. One reference surface is called the Geoid, the other reference surface is the ellipsoid.

    The deviation between the Geoid and an ellipsoid is called the geoid separation (N) or geoid undulation

    The Geoid is used to describe heights. In order to establish the Geoid as reference for heights, the ocean’s water level is registered at coastal places over several years using tide gauges (mareographs). Averaging the registrations largely eliminates variations of the sea level with time. The resulting water level represents an approximation to the Geoid and is called the mean sea level


    The height determined with respect to a tide-gauge station is known as the orthometric height
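The geoid separation N ties the two height systems together: ellipsoidal height h (e.g., from GNSS) equals orthometric height H plus geoid undulation N. A tiny sketch with illustrative numbers:

```python
# h = H + N, so the orthometric (above-geoid) height is H = h - N.
# The numbers below are illustrative, not from a real geoid model.

def orthometric_height(h_ellipsoidal, n_geoid):
    """H = h - N."""
    return h_ellipsoidal - n_geoid

# A GNSS receiver reads h = 250.0 m where the geoid sits 30.5 m below
# the ellipsoid (N = -30.5 m, roughly typical for parts of North America):
print(orthometric_height(250.0, -30.5))  # 280.5
```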

    Obviously, there are several realizations of local mean sea levels (also called local vertical datums) in the world


    The most convenient geometric reference is the oblate ellipsoid (figure below). It provides a relatively simple figure which fits the Geoid to a first order approximation, though for small scale mapping purposes a sphere may be used. An ellipsoid is formed when an ellipse is rotated about its minor axis. This ellipse which defines an ellipsoid or spheroid is called a meridian ellipse (notice that ellipsoid and spheroid are used here as equivalent and interchangeable words).


    The Sphere – As can be seen from the dimensions of the Earth ellipsoid, the semi-major axis a and the semi-minor axis b differ by only a bit more than 21 kilometres (figure below). A better impression of the Earth's dimensions may be achieved if we refer to a more "human scale". Considering a sphere of approximately 6 metres in diameter, the ellipsoid is derived by compressing the sphere at each pole by 1 cm only. This compression is rather small compared to the dimension of the semi-major axis a.
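The 21 km figure can be checked directly from the WGS84 ellipsoid axes:

```python
# Flattening of the WGS84 ellipsoid, computed from its axes.

a = 6378137.0        # semi-major axis (m)
b = 6356752.3142     # semi-minor axis (m)

flattening = (a - b) / a
print(round(a - b))              # 21385 m, i.e. a bit more than 21 km
print(round(1 / flattening, 3))  # 298.257, the familiar 1/f value
```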


    The most important global (or geocentric) spatial reference system for the GIS community is the International Terrestrial Reference System (ITRS). It is a three-dimensional coordinate system with a well-defined origin (the centre of mass of the Earth) and three orthogonal coordinate axes (X,Y,Z). The Z-axis points towards a mean Earth north pole. The X-axis is oriented towards a mean Greenwich meridian and is orthogonal to the Z-axis. The Y-axis completes the righthanded reference coordinate system (figure (a) below).
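Geodetic latitude/longitude/height can be converted into such a geocentric (X, Y, Z) frame; a sketch using WGS84 constants:

```python
import math

# Geodetic (lat, lon, h) to Earth-centred (X, Y, Z) coordinates,
# using the WGS84 ellipsoid.

A = 6378137.0                # semi-major axis (m)
F = 1 / 298.257223563        # flattening
E2 = F * (2 - F)             # first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, h=0.0):
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = A / math.sqrt(1 - E2 * math.sin(lat) ** 2)  # prime vertical radius
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - E2) + h) * math.sin(lat)
    return x, y, z

# A point on the equator at the Greenwich meridian lies a full
# semi-major axis along the X-axis:
print(geodetic_to_ecef(0, 0))  # (6378137.0, 0.0, 0.0)
```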


    Global horizontal datums, such as the ITRF2000 or WGS84, are also called geocentric datums because they are geocentrically positioned with respect to the centre of mass of the Earth

    1. Knowledge of georeferencing systems, including coordinate systems, spatial projections, and horizontal and vertical datums

    ‘To georeference’ is the act of assigning locations to atoms of information

    Is essential in GIS, since all information must be linked to the Earth’s surface

    The method of georeferencing must be:

    Unique, linking information to exactly one location

    Shared, so different users understand the meaning of a georeference

    Persistent through time, so today’s georeferences are still meaningful tomorrow

    georeferences are metric (define location using measures of distance from fixed places), based on ordering (street addresses in most parts of the world order houses along streets), or nominal (place names do not involve ordering or measuring)

    A spatial reference system (SRS) or coordinate reference system (CRS) is a coordinate-based local, regional or global system used to locate geographical entities. A spatial reference system defines a specific map projection, as well as transformations between different spatial reference systems. Spatial reference systems are defined by the OGC's Simple feature access using well-known text, and support has been implemented by several standards-based geographic information systems. Spatial reference systems can be referred to using a SRID integer, including EPSG codes defined by the International Association of Oil and Gas Producers.

    What kinds of things can be distorted with different map projections?

    Distance • Direction • Shape • Area


    Mercator Projection – developed by Flemish cartographer Gerardus Mercator in 1569. Preserves shape and direction; used widely for navigation charts because direction is preserved.
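The forward equations of the spherical form of the Mercator projection, as a sketch:

```python
import math

# Spherical Mercator forward equations:
#   x = R * lambda,   y = R * ln(tan(pi/4 + phi/2))
# R is an assumed sphere radius in metres.

def mercator(lat_deg, lon_deg, r=6378137.0):
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    x = r * lon
    y = r * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y

# The equator maps to y ≈ 0, and y grows without bound toward the
# poles, which is why Mercator maps cannot show the poles themselves.
print(mercator(0, 10))
print(mercator(60, 10))
```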

    Transverse Mercator


    Three main types of map projections

    Cylindrical, conic, azimuthal (planar)

    A DATUM is a model of the Earth as a spheroid

    A curved surface (e.g., portions of the earth) gets distorted when represented on a flat map → map projections transform coordinates from a curved Earth to a flat map 

    A geodetic datum is a set of control points whose geometric relationships are known, either through measurement or calculation

    Datums have two components: the reference ellipsoid and a set of survey points. Both the shape of the spheroid and its position relative to the earth are important

    Cylindrical equal-area projections – straight meridians and parallels – meridians are equally spaced and the parallels are unequally spaced – area is true, shape and scale get distorted near the upper and lower regions of the map

    Transverse Mercator projections – projecting the sphere onto a cylinder tangent to a meridian (line of longitude)

    UTM – Universal Transverse Mercator – a global coordinate system – UTM zones are 6 degrees wide, so many broad study areas will fit within a single zone
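The standard zone numbering can be computed directly from longitude (the sketch below ignores the Norway/Svalbard zone exceptions):

```python
# 60 UTM zones of 6 degrees each, zone 1 starting at 180°W.

def utm_zone(lon_deg):
    """Zone number for a longitude in [-180, 180)."""
    return int((lon_deg + 180) // 6) + 1

print(utm_zone(-77.0))  # 18 (e.g., the US East Coast)
print(utm_zone(9.0))    # 32
```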

    Mercator – shapes are true, but area gets distorted (conformal)

    Azimuthal Equidistant – planar (tangent) – used for air route distances – distances measured from the center are true – distortion of other properties increases away from the center point


    Conic projections – generated by projecting a spherical surface onto a cone – distorts scale and distance except along standard parallels – areas are proportional and directions are true in limited areas – used in countries with a larger east-west than north-south extent

    – area and shape are distorted away from standard parallels – directions are true in limited areas


    The State Plane Coordinate System (SPCS) uses a unique set of projection parameters for each of the 50 states 

    Uses either a Transverse Mercator or Lambert’s conformal conic projection

    Suggestions from Theobald (1999: 42) on Selecting a Projection 

    • If you are making a fairly detailed map, for example of a city, or if requirements for accuracy are minimal, then you may not have to worry so much about which projection to use. 

    • If you are making a map at a regional to continental to global scale OR are interested in precise shape, area, or distance measurements, then you should choose the projection carefully. 

    • For many study areas there are already standard projections, such as State Plane for county or city governments or UTM for state governments. 

    • Three factors to consider related to accuracy: latitude of area, extent, and theme 

    – Latitude: 

      • Low-latitude areas (near the equator) use a conical projection 

      • Polar regions use an azimuthal (planar) projection 

    – Extent: 

      • Broad in east-west (e.g., the US): use a conical projection 

      • Broad in north-south (e.g., Africa): use a transverse-case cylindrical projection 

    – Thematic: 

      • If you are doing an analysis that compares different values in different locations, typically an equal-area projection will be used
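These rules of thumb can be codified roughly (my own sketch, not Theobald's; extent checks take precedence after the polar check, which matches the US and Africa examples):

```python
# Rough projection-family chooser from latitude and extent rules of thumb.

def suggest_projection(mean_lat_deg, east_west_extent, north_south_extent):
    if abs(mean_lat_deg) >= 70:
        return "azimuthal (planar)"       # polar regions
    if east_west_extent > north_south_extent:
        return "conic"                    # broad east-west extent
    return "transverse cylindrical"       # broad north-south extent

print(suggest_projection(80, 10, 10))  # azimuthal (planar)
print(suggest_projection(38, 58, 25))  # conic (e.g., the conterminous US)
print(suggest_projection(0, 30, 70))   # transverse cylindrical (e.g., Africa)
```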

    Horizontal datums are used for describing a point on the Earth's surface, in latitude and longitude or another coordinate system. Vertical datums measure elevations or depths.

    Horizontal datum

    The horizontal datum is the model used to measure positions on the Earth. A specific point on the Earth can have substantially different coordinates, depending on the datum used to make the measurement. The WGS 84 datum, which is almost identical to the NAD83 datum used in North America and the ETRS89 datum used in Europe, is a common standard datum.

    Vertical datum

    A vertical datum is used as a reference point for elevations of surfaces and features on the Earth including terrain, bathymetry, water levels, and man made structures.

    Vertical datums are either: tidal, based on sea levels; gravimetric, based on a geoid; or geodetic, based on the same ellipsoid models of the Earth used for computing horizontal datums. A tidal datum is typically mean sea level.

    An example of a gravity-based geodetic datum is NAVD88, used in North America, which is referenced to a point in Quebec, Canada. Ellipsoid-based datums such as WGS84, GRS80 or NAD83 use a theoretical surface that may differ significantly from the geoid.

    1. Cartography and Visualization 

      1. Knowledge of contour mapping

    A contour line (also isoline, isopleth, or isarithm) of a function of two variables is a curve along which the function has a constant value.[1] It is a cross-section of the three-dimensional graph of the function f(x, y) parallel to the x, y plane. In cartography, a contour line (often just called a "contour") joins points of equal elevation (height) above a given level, such as mean sea level.[2] A contour map is a map illustrated with contour lines, for example a topographic map, which thus shows valleys and hills, and the steepness of slopes.[3] The contour interval of a contour map is the difference in elevation between successive contour lines.[4]
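A small sketch tied to the contour-interval definition: which contour levels fall between two elevations, e.g., along a route on a topographic map (integer-metre contour levels assumed):

```python
# List the contour elevations strictly between two elevations, given
# the map's contour interval.

def contours_between(z1, z2, interval):
    lo, hi = sorted((z1, z2))
    levels = []
    z = (int(lo // interval) + 1) * interval  # first contour above lo
    while z < hi:
        levels.append(z)
        z += interval
    return levels

print(contours_between(112, 187, 20))  # [120, 140, 160, 180]
```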

    iso – Greek for "equal" – each contour line joins points of equal value

    isoline and isarithm – covers all types of contour lines

    isogon – contour line for a variable which measures direction

    isocline – a line joining points with equal slope

    Equidistants – isodistances – equal distance from a given point, line, polyline

    isopleths – contour lines that depict a variable which cannot be measured at a point, but which instead must be calculated from data collected over an area (population density) – can be done using interpolation

    isobar – line of equal or constant pressure

    isallobars – lines joining points of equal pressure change during a specific time interval

    isopycnal – constant density

    isotherm – line that connects points on a map that have the same temperature

    isogeotherm – line of equal mean annual temperature

    isocheim – line of equal mean winter temperature

    isothere – line of equal mean summer temperature

    isohel – line of equal or constant solar radiation

    isohyet – line joining points of equal precipitation

    isohume – line of constant relative humidity

    isodrosotherm – line of equal or constant dew point

    1. Knowledge of basic physical geography (e.g., types of boundaries, continents, landforms, and topography)

    Physical geography (also known as geosystems or physiography) is one of the two major sub-fields of geography.[1][2][3] Physical geography is that branch of natural science which deals with the study of processes and patterns in the natural environment like the atmosphere, hydrosphere, biosphere, and geosphere, as opposed to the cultural or built environment, the domain of human geography.

    from intro to geography text book:

    Basic Physical Geography

    two types of forces produce variations on the surface of the earth called landforms

    1. forces that push, move and raise the earth's surface

    2. forces that scour, wash and wear down the surface

    tectonic – generated from within the earth

    tectonic forces – 2 types

    1. diastrophic – great pressure acting on the plates that deforms them by folding, twisting, warping, breaking or compressing rock

    2. volcanism – force that transports heated material to or toward the surface of the earth

    diastrophism – geologists can trace the history of the development of a region

    broad warping – changing weight of a large region, movement of continents may bow an entire continent

    warping or bending effect and a ridge or series of parallel folds may develop

    faulting – fault is a break or fracture in rock along which movement has taken place

    escarpment – steep slope

    rift valley – separation away from fault causes sinking of land

    seismic waves – vibrations which cause earth movement

    an earthquake, volcanic eruption, or underwater landslide that occurs below an ocean jolts the water above and causes a tsunami

    three kinds of gradation processes

    1. weathering

      1. breakdown and decomposition of rocks and minerals at or near the earth's surface from water, air and temperature called weathering – both mechanical and chemical processes

      2. mechanical weathering – physical disintegration of earth materials at or near the surface – large rocks broken into smaller pieces

      3. three important types of weathering – frost action, development of salt crystals, root action

      4. chemical weathering – rocks decompose rather than disintegrate

      5. oxidation, hydrolysis, and carbonation – depends on availability of water, less chemical weathering in dry places

    2. mass movement

      1. downslope movement of material due to gravity – avalanches and landslides

      2. talus – accumulation of rock particles at the base of hills and mountains

    3. erosional agents and deposition

      1. wind, water, glaciers – carve existing landforms into new shapes

    landform regions – large section of the earth's surface where a great deal of homogeneity occurs among the types of landforms that characterize it

    1. mountains

    2. plains

    3. plateau

    Types of Boundaries

    lithosphere broken into 12 large and many small, rigid plates

    theory of plate tectonics – plates slide or drift very slowly over the heavy semimolten asthenosphere

    divergent plate boundaries – boundaries where plates move away from each other

    transform boundaries – one plate slides horizontally past another plate

    convergent boundaries – two plates move toward each other


    Continents are understood to be large, continuous, discrete masses of land, ideally separated by expanses of water.

    From the perspective of geology or physical geography, continent may be extended beyond the confines of continuous dry land to include the shallow, submerged adjacent area (the continental shelf)[7] and the islands on the shelf (continental islands), as they are structurally part of the continent.

    A landform is a natural feature of the Earth's surface. Landforms together make up a given terrain, and their arrangement on the landscape or the study of same is known as topography. Typical landforms include hills, mountains, plateaus, canyons, valleys, as well as shoreline features such as bays, peninsulas, and seas, including submerged features such as mid-ocean ridges, volcanoes, and the great ocean basins.

    Topography is a field of geoscience and planetary science comprising the study of surface shape and features of the Earth and other observable astronomical objects including planets, moons, and asteroids. It is also the description of such surface shapes and features (especially their depiction in maps). The topography of an area could also mean the surface shape and features themselves.

    Techniques of topography

    1. Direct Survey

    2. Remote Sensing

      1. Remote sensing is a general term for geodata collection at a distance from the subject area.

      2. Aerial and satellite imagery

      3. Photogrammetry

      4. Radar and sonar

    Forms of topographic data

    Terrain is commonly modelled either using vector (triangulated irregular network or TIN) or gridded (Raster image) mathematical models

    1. Raw survey data

      1. Topographic survey information is historically based upon the notes of surveyors. 

    2. Remote sensing data

    3. Topographic Mapping

      1. In its contemporary definition, topographic mapping shows relief. In the United States, USGS topographic maps show relief using contour lines. The USGS calls maps based on topographic surveys, but without contours, "planimetric maps."

    4. Digital elevation modeling

    5. Topological modeling

    1. Understanding of how data collection methods influence map design and representation

    What is the distinction between primary and secondary data sources? 

    1. Primary Data – One way to characterize data in geography concerns whether they were collected specifically for the purpose of a researcher’s particular study

      1. An example would be a geographer who interviews people about their attitudes toward bioengineered agriculture

    2. Secondary Data – If, instead, the data have been collected for another purpose, usually by someone other than the researcher, they are secondary data

      1. An example of that would be a geographer who uses Landsat imagery to study landslides on the California coast. The imagery was not collected by that researcher, and it was not collected primarily so he or she could study landslides

    What are the five major types of data collection in geography?

    1. Physical measurement 

      1. consist of data collected by recording physical properties of the earth or its inhabitants. Physical properties include size and number, temperature, chemical makeup, moisture content, texture and hardness, the reflectance and transmissivity of electromagnetic energy (including optical light), air speed and pressure, and more

      2. use of aerial and satellite remote sensing as ways to efficiently record large amounts of physical measurement data.

    2. Observation of behavior (Chapter 5)

      1. is the overt and potentially observable actions or activities of individuals or groups of people

      2. It is not their thoughts, feelings, or motivations, although very often behavioral observations provide the data that allow geographers to study thoughts, feelings, and motivations scientifically

    3. Archives (Chapter 5) 

      1. A third type of data collection practiced by geographers is the use of existing records that others have collected primarily for non-research purposes, at least not the geographer’s research

    4. Explicit reports (Chapter 6)

      1. beliefs people express about things—about themselves or other people, about places or events, about activities or objects

      2. Actually, explicit reports are also observations of behavior; answering a question on a survey is behaving, for instance. But we distinguish reports as distinct types of data collection because they always involve explicit recognition by people that researchers are studying them, and because research participants’ explicit beliefs and choices determine the data collected with explicit reports

    5. Computational modeling (Chapter 7)

      1. we defined models as simplified representations of portions of reality

      2. We noted that models can be realized in conceptual, physical, graphical, or computational form

    What are some of the ways geographers and others have made a distinction between quantitative and qualitative methods, and how do they relate to scientific and humanistic approaches in geography?

    1. Quantitative data consist of numerical values, measured on at least an ordinal level but more likely a metric level.

      1. quantitative methods are those that impose a relatively great amount of prior structure on collected data. That is, such methods involve a prior choice of constructs to study, a prior choice of variables with which to measure those constructs, and prior numerical categories with which to express the measured values of those variables

    2. Qualitative data are nonnumerical, or, as in nominal data, numerical values that have no quantitative meaning

      1. They consist of words (in natural language), drawings, photographs, and so on

      2. Qualitative methods, in contrast, involve less prior structure on data collection. Data collection that is very clearly qualitative might start with little more than a topic area or a broad research question. The constructs, variables, and especially the measurement values for the variables are determined as observations are made or even afterward

    Influence Map Design and Representation

    deceptive practices

    Because all maps are abstractions of reality, they can subtly or blatantly manipulate the message they impart, or contain intentionally false information

    Ignorance – as in the Middle Ages, when mapmakers filled unknown areas with mythical beasts

    Propaganda – e.g., Nazi Germany

    Military secrecy – maps in Soviet Russia were deliberately distorted to protect military sites

    1. Knowledge of graphic representation techniques, including thematic mapping, multivariate displays, and web mapping

    A thematic map is a type of map especially designed to show a particular theme connected with a specific geographic area. These maps "can portray physical, social, political, cultural, economic, sociological, agricultural, or any other aspects of a city, state, region, nation, or continent"



    Common thematic map types include choropleth, proportional symbol, isarithmic (isopleth), dot, and dasymetric maps.



    A dasymetric map is an alternative to a choropleth map. As with a choropleth map, data are collected by enumeration units. But instead of mapping the data so that each region appears uniform, ancillary information is used to model the internal distribution of the phenomenon. For example, population density will be much lower in forested areas than in urbanized areas, so in a common operation, land cover data (forest, water, grassland, urbanization) may be used to model the distribution of population reported by a census enumeration unit such as a tract or county

    Choropleth maps – These are maps, where areas are shaded according to a prearranged key, each shading or colour type representing a range of values

    Disadvantages of Choropleth Maps

    Although choropleths give a good visual impression of change over space there are certain disadvantages to using them:

    • They give a false impression of abrupt change at the boundaries of shaded units.

    • Choropleths are often not suitable for showing total values. Proportional symbols overlays (included on the choropleth map above) are one solution to this problem.

    • It can be difficult to distinguish between different shades.

    • Variations within map units are hidden, and for this reason smaller units are better than large ones.


    Isopleth maps

    Isopleth maps differ from choropleth maps in that the data is not grouped to a pre-defined unit like a city district. These maps can take two forms:

    • Lines of equal value are drawn such that all values on one side are higher than the "isoline" value and all values on the other side are lower, or

    • Ranges of similar value are filled with similar colours or patterns.

    This type of map is ideal for showing gradual change over space and avoids the abrupt changes which boundary lines produce on choropleth maps. Temperature, for example, is a phenomenon that should be mapped using isoplething, since temperature exists at every point (is continuous), yet does not change abruptly at any point (like population density may do as you cross into another census zone). Relief maps should always be in isopleth form for this reason.


    Proportional Symbol Maps

    As the name implies, symbols (most often circles) are drawn proportional in size to the size of the variable (e.g. employment change) being represented. Proportional symbol maps are not dependent on the size of the area associated with the variable. In other words, on a proportional symbol map of Europe, tiny Liechtenstein would have the same visual importance as Spain if their unemployment values were the same. This would not be the case with a choropleth map.

    An example of proportional circles is shown on the Czech Republic Voting Register map (above).

    Scaling proportional symbols. Much research has gone into the optimal scaling for proportional symbols. As a general rule, make sure that the area, rather than linear proportions like radius or length of a side, is the scaled parameter. For example, if there are four times as many gentrified businesses in El Raval Site 1 than in Site 3, the area of the symbol should be four times greater for Site 1. If the symbol choice is a circle, the radius of the Site 1 symbol should thus be only twice as great (since area scales with the square of the radius).
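    The square-root rule above can be sketched in a few lines of Python (the function name `symbol_radius` and the sample values are illustrative, not from any particular GIS package):

```python
import math

def symbol_radius(value, anchor_value, anchor_radius):
    """Radius of a proportional circle, scaling AREA (not radius) with the
    data value. Since area = pi * r^2, r must scale with sqrt(value)."""
    return anchor_radius * math.sqrt(value / anchor_value)

# Site 3: 10 businesses drawn with radius 5; Site 1 has four times as many.
r_site3 = symbol_radius(10, 10, 5.0)   # 5.0
r_site1 = symbol_radius(40, 10, 5.0)   # 10.0 -> radius doubles, area quadruples
```

    Scaling the radius linearly instead (radius 20 for Site 1) would exaggerate the difference sixteen-fold in area.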

    Dot maps

    Used to show the distribution of phenomena where values and location are known. Dot maps create a visual impression of density by placing a dot or some other symbol in the approximate location of the variable being mapped. Dot maps should be used only for raw counts, not for derived values or percentages. Appropriate themes for dot maps include the distribution of dairy farms, and population distribution in a region.

    Their limitations include the difficulty of counting large numbers of dots in order to get a precise value and the need to have a large amount of initial information before drawing the map.

    Dot map parameters. When constructing a dot map, two parameters must be considered: the graphical size of each dot and the value associated with each dot. For example, you might stipulate that each dot be 2 pixels in diameter, and each represent 100 persons. In general, many small dots, each representing relatively few instances of the attribute, is more effective than a few large dots, but is more tedious to construct.
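    The dot-value parameter can be sketched as follows (`dot_count` and the 100-persons-per-dot figure are illustrative assumptions):

```python
def dot_count(value, per_dot=100):
    """Number of dots to place for one enumeration unit, each dot standing
    for `per_dot` instances of the attribute (e.g., 100 persons per dot)."""
    return round(value / per_dot)

# A tract of 12,340 persons at 100 persons per dot gets 123 dots.
n = dot_count(12340)   # 123
```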

    Multivariate displays put two or more attributes on a single map, typically by combining visual variables (e.g., hue for one variable and symbol size for another)

    multiple displays – alternatively, several coordinated side-by-side maps, one per variable

    Web mapping is the process of using maps delivered by geographical information systems (GIS). Since a web map on the World Wide Web is both served and consumed, web mapping is more than just web cartography, it is both a service activity and consumer activity. Web GIS emphasizes geodata processing aspects more involved with design aspects such as data acquisition and server software architecture such as data storage and algorithms, than it does the end-user reports themselves.[1] The terms web GIS and web mapping remain somewhat synonymous. Web GIS uses web maps, and end users who are web mapping are gaining analytical capabilities. The term location-based services refers to web mapping consumer goods and services. Web mapping usually involves a web browser or other user agent capable of client-server interactions.[2]

    While web mapping is still developing, challenges and innovations concerning quality, usability, social benefits, and legal constraints continue to drive its evolution

    1. Knowledge of principles of map design, including symbolization, color use, and typography, for a variety of print and digital formats

    Map Layout

    • a title

    • one or more map images, including inset maps

    • a legend or key

    • a visual or narrative explanation of the map scale

    • supporting media, such as photographs, diagrams, and text

    • a north arrow or other depiction of orientation

    • metadata, explaining such information as the currency of the information, sources used, projection, copyrights, and authorship

    reference maps – general information about the location of features

    thematic maps – show the distribution of a specific topic

    All maps are representations

    Features are generalized because they can’t be shown at their true size

    symbols – represent things on a map

    map accuracy – difficult to assess, all maps show a selective view of reality – rather than ask is the map accurate, ask is the map appropriate for my purposes

    Map scale – 1:100 – one unit on the map represents 100 of the same units in the real world

    Representing scale – scale bar 


    large scale – show more detail than a small scale – 1:10000 is larger than 1:25000000


    Generalization – intended to remove unnecessary detail – maps cannot show everything

    select which features to show and omit

    Symbolization – assigning symbols to represent features

    geographic dimensions – what geographic features will be on the map

    measurement level – how data is measured – qualitative vs quantitative

    nominal data – differ in type and can’t be ranked (tree species, land uses)

    ordinal data – can be ranked (low, medium, high) – you can order the categories but cannot quantify the difference between them

    interval/ratio data – have meaningful numerical differences between values (elevation, population)

    Data Processing – know how the data was manipulated, statistics reported as raw values or standardized by some measure

    Visual variables – size, shape, orientation, pattern, hue, value

    size and value for quantitative, shape, pattern, hue for qualitative


    typography is the design of text, point size, line length, typefaces

    In terms of cartographic design, typography refers to any text material found on or in relation to a map – titles, labels, legends, etc. It is the art and science behind labeling on maps.

    Text is a crucial part of any quality map. Text simultaneously serves several purposes:

    • It identifies unique features (e.g., "The United Kingdom")

    • It places features within broader categories (e.g., "park")

    • It locates features within a general geographic context (e.g., this vegetation stand is within "Zion National Park")

    • It explains the characteristics and meaning of features on the map (e.g., "high economic potential zone")

    • It prescribes and proscribes action (e.g., "camping not permitted here")

    • It can add to the aesthetic beauty of a map

    • It can give a map an aesthetic feel (e.g., using a typeface that looks modern or historical)

    1. Understanding of how the selection of data classification and/or symbolization techniques affects the message of the thematic map

    Classification – objects with similar symbols

    Up to seven classes is the most that people can reliably distinguish – try to stick with five

    Classes should be exhaustive (describe all possible values) and should not overlap (no value can fall into two classes)

    Ways to split classes

    Equal range – equal distance between class breaks

    Quantiles – equal number of observations in each class

    Standard deviation – class breaks based on distance of standard deviation from the mean

    Natural breaks – class breaks conform to gaps in data distribution
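    Two of the classification methods above can be sketched in Python (function names and the sample data are illustrative; GIS software offers these as built-in classifiers):

```python
def equal_interval_breaks(values, k):
    """Interior breaks for k equal-range classes (k-1 break values)."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + step * i for i in range(1, k)]

def quantile_breaks(values, k):
    """Breaks placing (roughly) equal numbers of observations per class."""
    s = sorted(values)
    n = len(s)
    return [s[(n * i) // k] for i in range(1, k)]

data = [2, 4, 4, 5, 7, 9, 12, 15, 18, 20]
equal_interval_breaks(data, 3)   # [8.0, 14.0]
quantile_breaks(data, 3)         # [5, 12]
```

    Note how the two methods choose different breaks for the same data, which is why classification choice changes the map's message.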


    1. GIS Design Aspects and Data Modeling 

      1. Knowledge of data exchange procedures

    Three data models – the conceptual model, the data structure model, and the transfer model 

    Conceptual Model – This model describes the spatial objects, as well as the logical and topological relationships between the spatial objects and the captured spatial entities. This general model is object oriented and is also based on existing topological and graph models for spatial data.

    Data Structure Model – This model is used to express the spatial objects of the conceptual model in terms of transfer data structures. The data structures used in this transfer standard are based on the traditional relational and network models. Data structures viewed as spatial data structures are both the traditional vector and raster models

    Transfer Model – This model is used to express the logical constructs of the transfer form in terms of implementation-media constructs. The implementation constructs are made operational by an implementation method. The implementation method selects one or more media and defines the constructs pertaining to those media.

    The transfer model is defined in terms of its constructs and logical relationships. It deals with three types of transfer constructs: (1) logical constructs solely pertaining to this standard, (2) constructs relating to the implementation method, and (3) constructs solely pertaining to the transfer media.

    Data Dictionary/Definition and Data Dictionary/Domain should be included

    a) The specific set of attributes in an attribute module

    b) The relationship between these attributes and an entity

    c) The authorities of the attributes and (or) entity

    d) The format, measurement unit, and maximum length of an attribute

    e) Whether an attribute is a part of a primary or foreign relational key.

    Schema model should be included

    –File based approach: geographic data is encoded in a structured file format, for batch transfer or download

    –Application programming interface (API) approach: geographic data is accessed and exchanged as needed between software systems on the same workstation, often interactively with the user

    –Web services approach: geographic data is accessed and exchanged over networks and the Internet between software components, using http and other web based protocols

    1. Knowledge of security restrictions on data (e.g., user permissions and access rights)

    Data ownership

    -ArcGIS – the user who creates tables, feature classes, etc. owns those datasets

    User Access – database must verify the user accounts that connect to it – dba has to add users to database

    authentication – database checks the list of users to make sure a user is allowed to make a connection

    2 types of authentication

    1. Operating system (OS) authentication – the user logs in to the computer, and the OS of the user’s computer supplies the credentials for authorization to the database

      1. only need to log in once

    2. Database authentication – users log in to the server and then must separately log in to the database

    Groups (roles, types, or authorities) – grant privileges to users based on their common functions

    Public role – any right granted to public is granted to everyone with a db connection

    -this would be similar to the connect role

    Tips for groups:

    1. Create separate groups for system and object privileges

    2. Choose a naming convention that reflects each type of group for easy reference

    3. Grant privileges directly to the gdb administrator and grant privileges via groups for all other users

    4. Avoid mixing roles with directly granted privileges for non-administrator accounts

    1. Knowledge of database administration

    geodatabase admin responsible for gdb system tables, triggers, views, and procedures

    DEFAULT gdb version

    Default schema names – applies to gdb admin as well as nonadmin users who create data

    basic tasks – 

    1. backup and recovery databases

      1. periodically testing a backup and recovery plan

      2. Backups are being done as scheduled

    2. Database security

      1. Prevent hackers

      2. Security models

      3. Three basic security tasks

        1. Authentication – setting up user accounts to control logins to the database

        2. Authorization – setting permissions on various parts of the database

        3. Auditing – tracking who did what with the database

          1. can be based on auditing laws as well

    3. Storage and capacity planning

      1. how much disk storage is required and monitoring disk space

      2. Watch growth trends

    4. Performance monitoring and tuning

      1. Monitor database server on a regular basis to identify bottlenecks

      2. Tuning

        1. Capacity of the server hardware and OS configuration can limit performance

        2. Database is physically laid out on the disk drives

        3. Types of indexing

        4. Queries against the db can change how fast the results are returned

        5. DBA needs to understand which monitoring tools are available

    5. Troubleshooting

      1. DBA needs to quickly ascertain the problem to correct it

    6. Other important tasks

      1. High availability – the database needs to be available at all times

      2. Very Large Databases – the volume and variety of data stored in databases keeps growing

      3. Data extraction, transformation, and loading (ETL) – data must be cleansed before loading

    1. Knowledge of systems architecture and design

    Removing data duplication

    Improving the currency and accuracy of information used in decision-making

    Increasing the reliability of systems

    Decentralizing data maintenance

    Improving GIS system availability and stability

    Improving utilization of systems resources

    3 tiers – 

    presentation tier

    application logic tier

    data tier

    The system architecture design process aligns identified business requirements (user needs) derived from business strategy, goals, and drivers (business processes) with identified business information systems infrastructure technology (network and platform) recommendations.

    System design starts with identifying business needs. This includes identifying user locations and required information products, identifying required data resources, and developing appropriate software applications to do the work

    1. System architecture design translates business needs to identified IT requirements. 

    2. Hardware requirements are generated based on peak software processing loads. 

    3. Network connectivity requirements are generated based on peak data flow. 

    4. Capacity Planning tools are provided to automate the design analysis. 

    5. Capacity Planning Tools make the process of aligning Business workflows with selected IT resources agile and iterative in nature, rapidly identifying system performance impacts in response to changing business and technology architecture patterns.

    System planning:

     identifying business needs, defining project requirements, and reducing implementation risk

    • How many users can I support with my existing hardware?

    • What hardware do I need to purchase?

    • How many servers (cores) do I need?

    • What are the software licensing requirements?

    • What workflow loads should I use for my existing applications?

    • What are my current workflow service times?

    • What is the capacity of my current system?

    4 architecture domains cover the overall enterprise business needs:

    • The Business Architecture defines the business strategy, governance, organization, and key business processes.

    • The Information Systems Architecture includes a review of the Data and Application architecture.

    > The Data Architecture describes the structure of an organization’s logical and physical data assets and data management resources.

    > The Application Architecture provides a blueprint for the individual applications to be deployed, their interactions, and their relationships to the core business processes of the organization.

    • The Technology Architecture describes the logical software and hardware capabilities that are required to support the deployment of business, data, and application services. This includes IT infrastructure, middleware, networks, communications, processing, standards, etc.

    Implementation Strategy

    Requirements phase

    • User information product needs establish a foundation for completing the design.

    • User location and peak business loads establish a foundation for system architecture design.

    Design phase

    • Infrastructure requirements must be identified to quantify deployment costs.

    • Network communication capacity is an important consideration for GIS deployments.

    • Hardware and software procurement requirements must be identified.

    • Software development and data acquisition needs must be identified.

    Best Practice: Business decisions for project funding and procurement authorization are often required for project effort to proceed beyond this phase.

    Construction phase

    • System procurement authorization, based on the design budget and deployment timeline.

    • Data acquisition and database design efforts begin.

    • Procurement authorization for application design and development.

    • Prototype testing plans completed and scheduled to validate product delivery within design performance targets.

    Implementation phase

    • Initial deployment and operational testing.

    • Final system delivery, user training, and workflow migration complete.

    • System maintenance operations.

    Capacity Planning Tools (CPT) – developed as a framework to promote successful GIS system design and implementation

    System architecture design process

    The enterprise GIS system design process aligns identified business requirements (user needs) derived from business strategy, goals, and drivers (business processes) with business information systems infrastructure technology (network and platform) recommendations.

    1. User needs assessment (results of a GIS user needs assessment provides inputs for the system architecture design analysis)

    2. Workflow loads analysis (translate user needs to project workflows with baseline traffic and processing transaction loads based on estimated workflow complexity)

    3. Technical architecture strategy (identify user locations, network connectivity, and data center server locations)

    4. User requirements analysis (translate peak user workflow loads to peak throughput transaction loads)

    5. Network suitability analysis (translate peak site/network throughput loads to peak site/network traffic and compare with available network bandwidth)

    6. Platform architecture selection (Identify data center platform tier configuration and identify platform selection for each tier)

    7. Software configuration (Identify platform assignment for each workflow software component peak transaction processing load)

    8. Enterprise design solution (combine all peak workflow software component processing loads on the assigned platform tier, translate baseline processing load to selected platform processing load, and generate number of nodes required for each platform tier with estimate of capacity utilization)

    1. Understanding of the enterprise environment

    Enterprise GIS environments include a broad spectrum of technology integration. Most environments today include a variety of hardware vendor technologies including database servers, storage area networks, Windows Terminal Servers, Web servers, map servers, and desktop clients, all connected by a broad range of local area networks, wide area networks, and Internet communications. All these technologies must function together properly to support a balanced computing environment.

    Centralized computing solutions with a single database environment are the easiest environments to implement and support. Distributed computer systems with multiple distributed database environments can be very complex and difficult to deploy and support. Many organizations are consolidating their data resources and application processing environments to reduce implementation risk and improve administrative support for enterprise business environments

    Enterprise Vision:


    GIS software deployment patterns are optimized to support your business needs:

    • Asset management

    • Planning and analysis

    • Field mobility

    • Operational awareness

    • Constituent engagement

    1. Knowledge of schemas and domains and how they interact


    See Also: data dictionary

    1. [computing] The structure or design of a database or database object, such as a table, view, index, stored procedure, or trigger. In a relational database, the schema defines the tables, the fields in each table, the relationships between fields and tables, and the grouping of objects within the database. Schemas are generally documented in a data dictionary. A database schema provides a logical classification of database objects.

    2. [computing] A set of rules, stored in a file, that describe the structure of an XML document. The number, type, and order of elements allowed in an XML document are described in the schema. An XML parser can compare XML documents against the schema. An XML document that uses open and close tags properly is said to be well formed; if it also follows the rules of its designated schema, it is said to be valid.
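    Well-formedness checking, as distinct from schema validation, can be sketched with Python's standard library (`is_well_formed` is an illustrative helper; validating against an XSD schema requires a third-party parser such as lxml, which the standard library does not provide):

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """True when every open tag has a matching close tag and nesting is
    proper. This checks well-formedness only, NOT validity against a schema."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

is_well_formed("<map><layer/></map>")   # True  (well formed)
is_well_formed("<map><layer></map>")    # False (tags do not match)
```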

    data dictionary

    See Also: metadata

    1. [data management] A catalog or table containing information about the datasets stored in a database. In a GIS, a data dictionary might contain the full names of attributes, meanings of codes, scale of source data, accuracy of locations, and map projections used.


    metadata

    1. [data transfer] Information that describes the content, quality, condition, origin, and other characteristics of data or other pieces of information. Metadata for spatial data may describe and document its subject matter; how, when, where, and by whom the data was collected; availability and distribution information; its projection, scale, resolution, and accuracy; and its reliability with regard to some standard. Metadata consists of properties and documentation. Properties are derived from the data source (for example, the coordinate system and projection of the data), while documentation is entered by a person (for example, keywords used to describe the data).

    domain [data transfer] The range of valid values for a particular metadata element.

    attribute domain

    1. [data structures] In a geodatabase, a mechanism for enforcing data integrity. Attribute domains define what values are allowed in a field in a feature class or nonspatial attribute table. If the features or nonspatial objects have been grouped into subtypes, different attribute domains can be assigned to each of the subtypes.

    coded value domain

    1. [ESRI software] A type of attribute domain that defines a set of permissible values for an attribute in a geodatabase. A coded value domain consists of a code and its equivalent value. For example, for a road feature class, the numbers 1, 2, and 3 might correspond to three types of road surface: gravel, asphalt, and concrete. Codes are stored in a geodatabase, and corresponding values appear in an attribute table.

    range domain

    1. [data structures] A type of attribute domain that defines the range of permissible values for a numeric attribute. For example, the permissible range of values for a pipe diameter could be between 1 and 32 inches.
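    Both domain types can be sketched as simple validation rules (the domain contents mirror the examples above; the names are illustrative, not an Esri API):

```python
# Hypothetical domains for illustration.
SURFACE_CODES = {1: "gravel", 2: "asphalt", 3: "concrete"}  # coded value domain
PIPE_DIAMETER = (1, 32)                                     # range domain, inches

def valid_surface(code):
    """A coded value domain permits only the enumerated codes."""
    return code in SURFACE_CODES

def valid_diameter(inches):
    """A range domain permits any value between its minimum and maximum."""
    lo, hi = PIPE_DIAMETER
    return lo <= inches <= hi

valid_surface(2)     # True  -> "asphalt"
valid_surface(9)     # False -> not in the domain
valid_diameter(16)   # True
valid_diameter(48)   # False -> outside 1..32
```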

    spatial domain

    1. [standards] For a spatial dataset in ArcGIS 9.1 and previous versions, the defined precision and allowable range for x- and y-coordinates and for m- and z-values, if present.

    2. [ESRI software] In ArcGIS Survey Analyst, a constraint that sets the minimum and maximum values for the geometry attributes. The extents of this domain define the precision at which geometry attributes (x, y, z, m, id) can be stored as integers. There is a finite number of integers available in the system, so the x,y spatial domain is analogous to a square grid that always contains the same number of rows and columns.

    1. Knowledge of digital file management

    File creation, edit, management

    back up data

    Used to keep track and organize files

    Hierarchical file system – one that uses directories to organize files into a tree structure

    OS has file management system, but can purchase more sophisticated FMS – backup procedures and stricter file protection

    Individual files – shapefiles, file gdbs, tables/spreadsheets, CAD, rasters

    Databases – direct connection to relational database management systems and big data databases

    Geodatabases – stores GIS in a central location for easy access

    Cloud – store it in the cloud!

    Edit data – allows single-user or multiuser editing

    Take control of big data – visualize multiple different types

    Integrate your enterprise – data stored in big business systems to extend their analytical capabilities

    Data Rules and Relationships – define relationships between datasets and set rules (domains and subtypes)

    Manage metadata – describes content, quality, origin, and other characteristics of data – data about data – FGDC, ISO, INSPIRE, and Dublin Core

    Secures data – flexibility and control over how GIS platform is deployed, maintained, secured, and used


    version creation – child version (new version) created from a parent version (existing version) – identical to parent when first created, but will diverge as changes are made to each version – each dataset in database appears only once but behind the scenes, data is in delta tables (“A” (add) and “D” (delete) tables) – each version has an owner, description, parent version, associated database state, and level of user access

    user access – Private (only owner can view and edit), protected (only owner can edit but all can view), public (anyone can view and edit) 

    version workflows – simplest is concurrent editors editing DEFAULT version, create a separate version for each editor, another is to create a QA version to QAQC edits from users
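    The A/D delta-table idea can be sketched as a toy model (this illustrates only the concept, not ArcSDE's actual storage):

```python
def version_view(base, adds, deletes):
    """A version's view of a dataset: the base table's rows, minus rows in
    the D (delete) table, plus rows in the A (add) table."""
    view = {oid: row for oid, row in base.items() if oid not in deletes}
    view.update(adds)
    return view

base = {1: "road A", 2: "road B", 3: "road C"}   # rows in the base table
adds = {4: "road D"}                             # A table for this version
deletes = {2}                                    # D table for this version

version_view(base, adds, deletes)
# -> {1: 'road A', 3: 'road C', 4: 'road D'}
```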


    states – version references a specific database state – a state is a unit of change that occurs in the database – every edit operation performed in the gdb creates a new db state – edit operation is any set of tasks (additions, deletions, modifications) on features and rows – State ID values apply to any and all changes made in the gdb



    DEFAULT version – owned by the ArcSDE admin – always exists – root version and ancestor to all other versions – the published, current version of the geodatabase

    Version management – versions can be created or deleted – edits are isolated in that version until admin merges changes with another version.

    Schema changes affect all other versions (adding a new field)

    reconcile – edits from an ancestor version (target version) are brought into the version being edited in an edit session (edit version) – ancestor version is any version in direct ancestry of the version being edited

    -reconcile – bring all edited features and rows into the edit version – any conflicts will be taken care of

    Conflicts – when a feature was edited in both the edit version and the target version

    Post – second step when merging edits between two versions – the post process synchronizes the current edit version with the target version – all edits made in the edit version are saved into the target version – both versions are then identical

    compress – an actively edited enterprise geodatabase accumulates state IDs in delta tables and develops a complex state tree, which negatively affects performance – compressing never removes feature data; it cleans up only unused states

    1. Knowledge of database design

    database design – process of producing a detailed data model of a database

    design process – conceptual schema, logical data model, physical database design

    1. determine the relationships between different data elements

    2. Superimpose a logical structure upon the data on the basis of these relationships

    3. Determine the data to be stored – SME – part of requirements analysis

    conceptual schema – determine where relationships and dependency is within the data – data could be changed in the background

    Logical Data Model – once relationships and dependencies are determined – arrange the data into a logical structure that can be mapped into the storage objects supported by the database management system – each table may be a logical object or a relationship joining one or more instances of logical objects

    Physical database design – physical configuration of the database on the storage media – includes detailed specification of data elements, data types, indexing options, and other parameters residing in the DBMS data dictionary – detailed design that includes modules & db hardware & software specs

    Old school esri 11 steps to gdb design – not the best method

    1. Identify the information products that you will create and manage with your GIS.

    2. Identify the key data themes based on your information requirements.

    3. Specify the scale ranges and the spatial representations of each data theme at each scale.

    4. Decompose each representation into one or more geographic datasets.

    5. Define the tabular database structure and behavior for descriptive attributes.

    6. Define the spatial behavior, spatial relationships, and integrity rules for your datasets.

    7. Propose a geodatabase design.

    8. Design editing workflows and map display properties.

    9. Assign responsibilities for building and maintaining each data layer.

    10. Build a working prototype. Review and refine your design.

    11. Document your geodatabase design.

    1. Knowledge of database general structure (e.g., tables and data)

    schema objects (oracle) – tables

    -tables – collection of related data held in structured format within a database, contains fields and rows

    -views – result set of a stored query on the data, which the database users can query just as they would in a persistent database collection object

    –view is not part of the physical schema – virtual table computed or collated dynamically from  data in the database when access to that view is requested

    • Views can represent a subset of the data contained in a table. Consequently, a view can limit the degree of exposure of the underlying tables to the outer world: a given user may have permission to query the view, while denied access to the rest of the base table.

    • Views can join and simplify multiple tables into a single virtual table.

    • Views can act as aggregated tables, where the database engine aggregates data (sum, average, etc.) and presents the calculated results as part of the data.

    • Views can hide the complexity of data. For example, a view could appear as Sales2000 or Sales2001, transparently partitioning the actual underlying table.

    • Views take very little space to store; the database contains only the definition of a view, not a copy of all the data that it presents.

    • Depending on the SQL engine used, views can provide extra security.
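The view properties above can be demonstrated end-to-end with Python's built-in sqlite3 module; the table and column names here are purely illustrative.

```python
import sqlite3

# In-memory database with a made-up "parcels" table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE parcels (pin TEXT, zoning TEXT, acres REAL)")
con.executemany("INSERT INTO parcels VALUES (?, ?, ?)",
                [("001", "RES", 0.25), ("002", "COM", 1.10), ("003", "RES", 0.4)])

# A view stores only the query definition, not a copy of the data;
# it exposes a subset of the base table (residential parcels only).
con.execute("CREATE VIEW res_parcels AS "
            "SELECT pin, acres FROM parcels WHERE zoning = 'RES'")

rows = con.execute("SELECT pin, acres FROM res_parcels ORDER BY pin").fetchall()
print(rows)  # [('001', 0.25), ('003', 0.4)]
```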

    sequences – a schema object that generates an ordered series of unique numbers on request – commonly used to generate primary key values (in mathematics, by contrast, a sequence is an ordered collection in which repetitions are allowed, finite or infinite)

    synonyms – an alias or alternate name for a table, view, sequence or other schema object – easier for users to access database objects

    indexes – data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure – quickly locate data without having to search every row in a db table every time a database table is accessed – indexes can be on one or more columns of a database table

    clusters – (Oracle) a group of tables that share common columns and are physically stored together, which can speed up joins on those columns

    database links – (Oracle) a pointer defining a one-way communication path from one database to another, letting users query remote tables as if they were local

    snapshots – state of a system at a particular point in time – can refer to an actual copy of the state of a system

    procedures – subroutine available to applications that access a relational database system – stored in the database data dictionary – typical uses – (data validation, access control mechanisms)

    database trigger is procedural code that is automatically executed in response to certain events on a particular table or view in a database.
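A minimal trigger sketch, again using sqlite3 (table names invented): the audit row is written automatically by the database in response to the INSERT, with no extra application code.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE roads (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE audit (action TEXT, road_id INTEGER);

-- Procedural code executed automatically on an event
-- (here: every INSERT on the roads table).
CREATE TRIGGER log_insert AFTER INSERT ON roads
BEGIN
    INSERT INTO audit VALUES ('insert', NEW.id);
END;
""")

con.execute("INSERT INTO roads (id, name) VALUES (1, 'Main St')")
print(con.execute("SELECT * FROM audit").fetchall())  # [('insert', 1)]
```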

    functions – aka subroutine – In computer programming, a subroutine is a sequence of program instructions that perform a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed. Subprograms may be defined within programs, or separately in libraries that can be used by multiple programs.

    A software package is software that has been built from source with one of the available package management systems (PMS)

    non-schema objects – users, roles, contexts, directory objects

    1. Knowledge of geospatial data structure (e.g., topology rules)

    Vector – Points, Lines, Polygons (enclosed area) – feature layer is linked to an attribute table

    Raster – world is represented by an array of gridded cells 

    – can store values that represent categories (vegetation type) – basic grid attribute table has a value (code or some real number representing information about the grid cell) and count field (how many grid cells have that same value)

    -can also store continuous values (elevation)

    -an aerial photo is a raster but carries no attribute data – it serves as a background image and is not considered a “data” structure

    TIN – Triangulated irregular network – represents surfaces – created from contours

    -Advantages – small areas with high precision elevation data – can use multiple data inputs – more efficient storage than DEM or contour lines

    -Disadvantages – accurate TINs require very accurate source data – cost to create high precision elevation is very high and data files are large – TIN production and use is very computer intensive – Raster DEM data are more available

    Tabular Information – attribute Table

    -attribute field types – numeric, text, date, blob

    Topology – features need to be connected using specific rules

    -Network topology – features can be connected in a network – lines and junctions are specified and connected so that water can be traced along a flow

    -Planar topology – specifies topological rules for features (e.g., parcel boundaries cannot overlap each other; streets must connect at intersections)





    Topological relationships – do not change if you imagine the map on a rubber sheet being pulled and stretched in different directions; the rules stay intact – parcels still don’t overlap, streets still intersect

    Vector advantages –

    Vectors just seemed more correcter

    -Can represent point, line, and area features very accurately.

    -Far more efficient than raster data in terms of storage.

    -Preferred when topology is concerned.

    -Support interactive retrieval, which enables map generalization.

    Vector disadvantages –

    Vectors are more complex

    -Less intuitively understood.

    -Overlay of multiple vector maps is very computationally intensive.

    -Display and plotting of vectors can be expensive, especially when filling areas.

    Raster Advantages

    Rasters are faster…

    -Easy to understand.

    -Good for representing surfaces, i.e., continuous fields.

    -Easy to read and write – a grid maps directly onto a programming memory structure called an array.

    -Easy to input and output – a natural fit for scanned or remotely sensed data – easy to draw on a screen or print as an image.

    -Analytical operations are easier, e.g., autocorrelation statistics, interpolation, filtering.

    Raster disadvantages

    Rasters are bigger

    -Inefficient for storage – raster compression techniques may not be efficient when dealing with extremely variable data – using large cells to reduce data volume causes information loss.

    -Poor at representing points, lines, and areas – points and lines in raster format have to move to a cell center; lines can become fat; areas may need separately coded edges; each cell can be owned by only one feature.

    -Good only at very localized topology, and weak otherwise.

    -Suffer from the mixed pixel problem.

    -Must often include redundant or missing data.


    Map data collection often tabulates data at significant points.

    -Land surface elevation surveys seek “high information content” points on the landscape, such as mountain peaks, the bottoms of valleys and depressions, and saddle points and break points in slopes.

    -Assume that between triplets of points the land surface forms a plane.

    -Triplets of points forming irregular triangles are connected to form a network.

    TIN advantages

    Triangulated Irregular Networks (TIN) – Advantages

    -More accurate and use less space than grids.

    -Can be generated from point data faster than grids.

    -Can describe more complex surfaces than grids, including vertical drops and irregular boundaries.

    -Single points can be easily added, deleted, or moved.

    Topology Rules

    1. Understanding of desktop, server, enterprise, and hosted (e.g., cloud) applications available, including their benefits and shortcomings

    Desktop – individual user on a computer, make maps, data analysis, data creation

    Server – bring gis into hands of everyone in organization, allows access to web GIS, control of GIS data on your own infrastructure, control over how GIS platform is deployed, maintained, secured and used

    hosted (cloud) – ability to discover, use, make, and share maps with any device anywhere,  anytime  – access other users maps and data – connect more people outside of organization and share latest maps, data, and ideas

    enterprise? – is this not server?

    An Enterprise GIS is a geographic information system that is integrated through an entire organization so that a large number of users can manage, share, and use spatial data and related information to address a variety of needs, including data creation, modification, visualization, analysis, and dissemination

    the server is a method of achieving an enterprise GIS system – this is bullshit!

    1. Working knowledge of GIS hardware and software capabilities (e.g., application servers, data servers, storage devices, and workstations)

    basically says that hardware and software are of all types – 

    software runs on a wide range of hardware types from centralized computer servers to desktop computers used in stand-alone or networked configurations

    Software may rely on DBMS

    The hardware, software, and communication network(s), collectively referred to here as the system infrastructure for an EGIS, deliver to each end user the specific spatial capabilities and resources needed to support their business functions. A conceptual configuration for the EGIS system infrastructure can be established based on the characteristics of existing system infrastructure; required information products and spatial and nonspatial data resources; essential spatial analysis, display, and reporting functions; needed data management resources; and the anticipated number of end users within the departments. 


    1. Knowledge of data models, including vector, raster, grid, TIN, topological, hierarchical, network, and object-oriented 

    see above for vector, raster, grid (is grid the same as Raster), TIN, topological data, network (topology)

    hierarchical database – A database that stores related information in a tree-like structure, where records can be traced to parent records, which in turn can be traced to a root record.

    network dataset

    1. [ESRI software] A collection of topologically connected network elements (edges, junctions, and turns) that are derived from network sources, typically used to represent a linear network, such as a road or subway system. Each network element is associated with a collection of network attributes. Network datasets are typically used to model undirected flow systems.

    object-oriented database

    1. [database structures] A data management structure that stores data as objects (instances of a class) instead of as rows and tables as in a relational database.


    grid

    See Also: ESRI Grid

    1. [cartography] Any network of parallel and perpendicular lines superimposed on a map and used for reference. These grids are usually referred to by the map projection or coordinate system they represent, such as the universal transverse Mercator grid.


    2. [data models] See raster.

    ESRI Grid

    1. [ESRI software] An ESRI data format for storing raster data that defines geographic space as an array of equally sized square cells arranged in rows and columns. Each cell stores a numeric value that represents a geographic attribute (such as elevation) for that unit of space. When the grid is drawn as a map, cells are assigned colors according to their numeric values. Each grid cell is referenced by its x,y coordinate location.

    1. GIS Analytical Methods 

      1. Knowledge of overlay analysis

    Vector Overlay Tools

    Identity – Input features, split by overlay features

    Intersect – Only features common to all input layers

    Symmetrical Difference – Features in either the input layer or the overlay layer, but not both

    Union – All input features

    Update – Input feature geometry replaced by update layer

    Raster Overlay Tools

    Zonal Statistics – Summarizes values in a raster layer by zones (categories) in another layer—for example, calculate the mean elevation for each vegetation category

    Combine – Assigns a value to each cell in the output layer based on unique combinations of values from several input layers

    Single Output Map Algebra – Lets you combine multiple raster layers using an expression you enter—for example, you can add several ranked layers to create an overall ranking

    Weighted Overlay – Automates the raster overlay process and lets you assign weights to each layer before adding (you can also specify equal influence to create an unweighted overlay)

    Weighted Sum – Overlays several rasters multiplying each by their given weight and summing them together.
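The Weighted Sum idea reduces to cell-by-cell arithmetic. A rough sketch on two tiny 2×2 "rasters" of suitability ranks (layer names, ranks, and weights are all made up):

```python
# Each raster is a nested list of suitability ranks for the same extent.
slope_rank = [[1, 2], [3, 1]]
soil_rank  = [[2, 2], [1, 3]]
weights = {"slope": 0.75, "soil": 0.25}  # weights sum to 1.0 here

# Weighted Sum: multiply each layer by its weight, then add cell by cell.
weighted_sum = [
    [weights["slope"] * s + weights["soil"] * o
     for s, o in zip(srow, orow)]
    for srow, orow in zip(slope_rank, soil_rank)
]
print(weighted_sum)  # [[1.25, 2.0], [2.5, 1.5]]
```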


    1. [analysis/geoprocessing] A spatial operation in which two or more maps or layers registered to a common coordinate system are superimposed, either digitally or on a transparent material, for the purpose of showing the relationships between features that occupy the same geographic space.

    2. [analysis/geoprocessing] In geoprocessing, the geometric intersection of multiple datasets to combine, erase, modify, or update features in a new output dataset.

    spatial overlay

    1. [analysis/geoprocessing] The process of superimposing layers of geographic data that cover the same area to study the relationships between them.

    The following lists the general steps to perform overlay analysis:

    1. Define the problem.

      1. Clear definition of each component and how they interact

    2. Break the problem into submodels.

      1. certain attributes can be in multiple sub models

    3. Determine significant layers.

      1. layers and attributes that affect each submodel need to be identified

      2. Some of these layers may need to be created

    4. Reclassify or transform the data within a layer.

    • Ratio—The ratio scale has a reference point, usually zero, and the numbers within the scale are comparable. For example, elevation values are ratio numbers, and an elevation of 50 meters is half as high as 100 meters.

    • Interval—The values in an interval scale are relative to one another; however, there is not a common reference point. For example, a pH scale is of type interval, where the higher the value is above the neutral value of 7, the more alkaline it is, and the lower the value is below 7, the more acidic it is. However, the values are not fully comparable. For example, a pH of 2 is not twice as acidic as a pH of 4.

    • Ordinal—An ordinal scale establishes order such as who came in first, second, and third in a race. Order is established, but the assigned order values cannot be directly compared. For example, the person who came in first was not necessarily twice as fast as the person who came in second.

    • Nominal—There is no relationship between the assigned values in the nominal scale. For example, land-use values, which are nominal values, cannot be compared to one another. A land use of 8 is probably not twice as much as a land use of 4.

    1. Weight the input layers.

      1. factors can be weighted based on their importance

    2. Add or combine the layers.

      1. establish the relationship of all the input factors together to identify the desirable locations

      2. fuzzy logic overlay analysis – 

    3. Analyze.

      1. analyze the results – do the potential results answer the question

    Fuzzy Membership – The Fuzzy Membership tool reclassifies or transforms the input data to a 0 to 1 scale based on the possibility of being a member of a specified set. 0 is assigned to those locations that are definitely not a member of the specified set, 1 is assigned to those values that are definitely a member of the specified set, and the entire range of possibilities between 0 and 1 are assigned to some level of possible membership (the larger the number, the greater the possibility).

    The Fuzzy Gaussian function transforms the original values into a normal distribution. The midpoint of the normal distribution defines the ideal definition for the set, assigned a 1, with the remaining input values decreasing in membership as they move away from the midpoint in both the positive and negative directions. The input values decrease in membership from the midpoint until they reach a point where the values move too far from the ideal definition and are definitely not in the set and are therefore assigned zeros.


    The Fuzzy Large transformation function is used when the larger input values are more likely to be a member of the set. The defined midpoint identifies the crossover point (assigned a membership of 0.5) with values greater than the midpoint having a higher possibility of being a member of the set and values below the midpoint having a decreasing membership. The spread parameter defines the shape and character of the transition zone.


    The Fuzzy Linear transformation function applies a linear function between the user-specified minimum and maximum values. Anything below the minimum will be assigned a 0 (definitely not a member) and anything above the maximum a 1 (definitely a member). For example, with a positive-sloped linear transformation having a minimum of 30 and a maximum of 80, any value below 30 will be assigned a 0 and anything above 80 a 1.
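The linear transformation described above is simple enough to sketch in plain Python; the 30/80 bounds mirror the example in the text, and the function name is my own.

```python
def fuzzy_linear(x, minimum=30.0, maximum=80.0):
    """Positive-slope linear membership: 0 below minimum, 1 above maximum,
    and linear in between (0.5 at the midpoint)."""
    if x <= minimum:
        return 0.0
    if x >= maximum:
        return 1.0
    return (x - minimum) / (maximum - minimum)

print(fuzzy_linear(30))   # 0.0  (definitely not a member)
print(fuzzy_linear(55))   # 0.5  (halfway between 30 and 80)
print(fuzzy_linear(100))  # 1.0  (definitely a member)
```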


    The Fuzzy MS Large transformation function is similar to the Fuzzy Large function, except the definition of the function is based on a specified mean and standard deviation. Generally, the difference between the two functions is that the Fuzzy MS Large function can be more applicable if the very large values are more likely to be a member of the set.


    The Fuzzy MS Small transformation function is similar to the Fuzzy Small function, except the definition of the function is based on a specified mean and standard deviation. Generally, the difference between the two functions is that the Fuzzy MS Small function can be more applicable if the very small values are more likely to be a member of the set.

    The Fuzzy Near transformation function is most useful if membership is near a specific value. The function is defined by a midpoint defining the center of the set, identifying definite membership and therefore assigned a 1. As values move from the midpoint, in both the positive and negative directions, membership decreases until it reaches 0, defining no membership. The spread defines the width and character of the transition zone.


    The Fuzzy Small transformation function is used when the smaller input values are more likely to be a member of the set. The defined midpoint identifies the crossover point (assigned a membership of 0.5) with values greater than the midpoint having a lower possibility of being a member of the set and values below the midpoint having a higher possibility of membership. The spread parameter defines the shape and character of the transition zone


    Fuzzy Overlay

    The Fuzzy Overlay tool allows the analysis of the possibility of a phenomenon belonging to multiple sets in a multicriteria overlay analysis. Not only does Fuzzy Overlay determine what sets the phenomenon is possibly a member of, it also analyzes the relationships between the membership of the multiple sets.

    The Fuzzy And overlay type will return the minimum value of the sets the cell location belongs to.

    The Fuzzy Or overlay type will return the maximum value of the sets the cell location belongs to

    The Fuzzy Product overlay type will, for each cell, multiply each of the fuzzy values for all the input criteria

    The Fuzzy Gamma type is an algebraic product of Fuzzy Product and Fuzzy Sum, which are both raised to the power of gamma.
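The overlay types above can be sketched as plain Python functions. This follows the usual formulation (Gamma = FuzzySum raised to gamma times FuzzyProduct raised to 1 − gamma); the sample membership values are made up.

```python
from math import prod

def fuzzy_and(vals):      # minimum membership across the sets
    return min(vals)

def fuzzy_or(vals):       # maximum membership across the sets
    return max(vals)

def fuzzy_product(vals):  # multiply memberships (always <= the smallest input)
    return prod(vals)

def fuzzy_sum(vals):      # complement of the product of complements
    return 1 - prod(1 - v for v in vals)

def fuzzy_gamma(vals, gamma=0.5):
    # Algebraic combination of Fuzzy Sum and Fuzzy Product.
    return fuzzy_sum(vals) ** gamma * fuzzy_product(vals) ** (1 - gamma)

memberships = [0.6, 0.8]  # toy membership values for one cell in two sets
print(fuzzy_and(memberships))                 # 0.6
print(fuzzy_or(memberships))                  # 0.8
print(round(fuzzy_product(memberships), 3))   # 0.48
print(round(fuzzy_sum(memberships), 3))       # 0.92
```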

    1. Functional knowledge of planar geometry (e.g., points, lines, and polygons) required to convert real world examples into spatial concepts

    In mathematics, a plane is a flat, two-dimensional surface. A plane is the two-dimensional analogue of a point (zero dimensions), a line (one dimension) and three-dimensional space. Planes can arise as subspaces of some higher-dimensional space, as with the walls of a room, or they may enjoy an independent existence in their own right, as in the setting of Euclidean geometry.

    I think this question will ask “you have a river, what is the best geometry representation of a river”

    1. Knowledge of algebra (e.g., deriving values from a basic formula) 

    just go nuts:

    not 100% sure what they may ask

    1. Knowledge of statistics (e.g., descriptives, summary statistics, and R-squared)

    Descriptive statistics is the discipline of quantitatively describing the main features of a collection of information,[1] or the quantitative description itself.

    -Summarize a sample rather than use the data to learn about the population that the sample of data is thought to represent.


    In descriptive statistics, summary statistics are used to summarize a set of observations, in order to communicate the largest amount of information as simply as possible

    In statistics, the coefficient of determination, denoted R2 or r2 and pronounced R squared, is a number that indicates how well data fit a statistical model – sometimes simply a line or a curve. An R2 of 1 indicates that the regression line perfectly fits the data, while an R2 of 0 indicates that the line does not fit the data at all. The latter can occur because the data are utterly non-linear, or because they are random.
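A minimal from-scratch computation of R² for a straight-line fit, using R² = 1 − SS_residual / SS_total; the sample data are invented for illustration.

```python
# Toy data, roughly following y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least-squares slope and intercept.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# R^2 = 1 - (residual sum of squares) / (total sum of squares).
ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot

print(round(r_squared, 4))  # close to 1: the line fits these data well
```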

    Spatial Statistics – Conceptual Models – 

    Inverse distance (spatial autocorrelation) – all features influence all other features, but the closer something is, the more influence it has 

    Distance band – features outside a specified distance do not influence the features within the area

    Zone of indifference – combines inverse distance and distance band

    K Nearest Neighbors – a specified number of neighboring features are included in calculations

    Polygon Contiguity – polygons that share an edge or node influence each other

    Spatial weights – specified by user (ex. Travel times or distances)

    Mean Center – average x- and y-coordinates for all features – useful for comparing distributions of different features or over time

    Central Feature – the feature having the shortest total distance to all other features – useful for finding the most accessible feature


    Standard distance – the extent to which the distances between the mean center and the features vary from the average distance

    Orientation – Linear directional mean identifies general mean direction of a set of lines

    Standard deviational ellipse is useful for comparing distributions of features and comparing one type of feature at different times


    Useful to:

    -Better understand geographic phenomena (ex. habitats)

    -Monitor conditions (ex. level of clustering)

    -Compare different sets of features (ex. patterns of different types of crimes)

    -Track change

    Average Nearest Neighbor – Measures how similar the actual mean distance between locations is to the expected mean distance for a random distribution
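A sketch of the Average Nearest Neighbor ratio under the common formulation (expected mean distance for a random pattern = 0.5 / √(n / area)); the points and study area here are made up.

```python
from math import dist, sqrt

def avg_nearest_neighbor_ratio(points, area):
    """ANN ratio: observed mean nearest-neighbor distance divided by the
    expected mean distance for a random pattern.
    Ratio < 1 suggests clustering; ratio > 1 suggests dispersion."""
    n = len(points)
    observed = sum(min(dist(p, q) for q in points if q is not p)
                   for p in points) / n
    expected = 0.5 / sqrt(n / area)
    return observed / expected

# Four points tightly grouped in one corner of a 100 x 100 study area.
clustered = [(1, 1), (1, 2), (2, 1), (2, 2)]
ratio = avg_nearest_neighbor_ratio(clustered, 100 * 100)
print(ratio < 1)  # True: the pattern is clustered
```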

    Ripley’s K-function – GIS counts the number of neighboring features within a given distance to each feature based on location. The test compares the observed K value at each distance to expected K value for a random distribution.

    Global statistics – identify and measure the pattern of the entire study area – do not indicate where specific patterns occur.

    Local Statistics – identify variation across the study area, focusing on individual features and their relationships to nearby features (i.e. specific areas of clustering)

    Spatial Autocorrelation (Moran’s I)

    -Measures whether the pattern of feature values is clustered, dispersed, or random.

    -Global statistic

    -Calculates I values to test for statistically significant clustering



    I > 0 – similar values cluster together; I < 0 – dissimilar values are adjacent (dispersed); I near 0 – values are randomly distributed
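Global Moran's I can be sketched in a few lines of Python; the weights matrix and values below are toy data chosen to show the sign of I for clustered versus dispersed patterns.

```python
def morans_i(values, weights):
    """Global Moran's I. values is a list of feature values; weights[i][j]
    is the spatial weight between features i and j (0 on the diagonal)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)

# Four features in a line, each neighboring the next (binary contiguity).
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
clustered = [1, 1, 9, 9]   # similar values adjacent -> positive I
dispersed = [1, 9, 1, 9]   # alternating values -> negative I
print(morans_i(clustered, w) > 0)  # True
print(morans_i(dispersed, w) < 0)  # True
```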




    Anselin Local Moran’s I 

    Local statistic 

    -Measures the strength of patterns for each specific feature.

    Positive I value:

    -Feature is surrounded by features with similar values, either high or low.

    -Feature is part of a cluster.

    -Statistically significant clusters can consist of high values (HH) or low values (LL)

    Negative I value:

    -Feature is surrounded by features with dissimilar values.

    -Feature is an outlier.

    -Statistically significant outliers can be a feature with a high value surrounded by features with low values (HL) or a feature with a low value surrounded by features with high values (LH).

    Getis-Ord General G

    -Global statistic that indicates whether similar values (either high or low) are clustered.

    -Works best when either high or low values are clustered (but not both).

    -The value of the G score indicates whether the relationship is statistically significant

    Hot Spot Analysis (Getis-Ord Gi*) 

    -Local version of the G statistic that indicates hot spots (clusters of high values) or cold spots (clusters of low values)

    -To be statistically significant, the hot spot or cold spot will be surrounded by features with similar values, but have significantly higher/lower values than its neighbors.

    -High G value = hot spot; low G value = cold spot

    Regression Analysis

    With Regression Analyses, you ask WHY something is happening.

    Model, examine, and explore spatial relationships

    Linear Regression

    -Used to analyze linear relationships among variables.

    -Linear relationships are positive or negative.

    -Regression analyses attempt to demonstrate the degree to which one or more variables potentially promote positive or negative change in another variable.

    Linear Regression Techniques

    Ordinary Least Squares (OLS) is the best known technique and a good starting point for all spatial regression analyses.

    -Global model – provides one equation to represent the entire dataset

    Geographically Weighted Regression (GWR)

    -Local model – fits a regression equation to every feature in the dataset

    -Regional variation is incorporated into the regression model

    1. Knowledge of basic programming (e.g., scripting, object oriented, query, and extensible)

    Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which are data structures that contain data, in the form of fields, often known as attributes; and code, in the form of procedures, often known as methods. A distinguishing feature of objects is that an object's procedures can access and often modify the data fields of the object with which they are associated (objects have a notion of "this" or "self"). In OO programming, computer programs are designed by making them out of objects that interact with one another.[1][2] There is significant diversity in object-oriented programming, but most popular languages are class-based, meaning that objects are instances of classes, which typically also determines their type.
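The ideas in this definition (fields, methods, "self", class-based instantiation) can be shown with a minimal Python sketch; the class and attribute names here are invented for illustration.

```python
class Layer:
    """Data (fields) and code (methods) packaged together as one object."""
    def __init__(self, name, features):
        self.name = name          # field (attribute)
        self.features = features  # list of feature records

    def count(self):              # method: accesses the object's own state via self
        return len(self.features)

class PointLayer(Layer):          # class-based OOP: objects are instances of classes
    geometry_type = "point"       # class-level attribute shared by all instances

hydrants = PointLayer("hydrants", [{"id": 1}, {"id": 2}])
print(hydrants.count())        # 2
print(hydrants.geometry_type)  # point
```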

    In software engineering, extensibility (not to be confused with forward compatibility) is a system design principle where the implementation takes future growth into consideration. It is a systemic measure of the ability to extend a system and the level of effort required to implement the extension. Extensions can be through the addition of new functionality or through modification of existing functionality. The central theme is to provide for change – typically enhancements – while minimizing impact to existing system functions.

    1. Knowledge of raster/vector principles 


    1. [data models] A coordinate-based data model that represents geographic features as points, lines, and polygons. Each point feature is represented as a single coordinate pair, while line and polygon features are represented as ordered lists of vertices. Attributes are associated with each vector feature, as opposed to a raster data model, which associates attributes with grid cells.


    1. [graphics (computing)] Any quantity that has both magnitude and direction.

    vector data model

    1. [data models] A representation of the world using points, lines, and polygons. Vector models are useful for storing data that has discrete boundaries, such as country borders, land parcels, and streets.


    See Also: lattice, vector

    1. [data models] A spatial data model that defines space as an array of equally sized cells arranged in rows and columns, and composed of single or multiple bands. Each cell contains an attribute value and location coordinates. Unlike a vector structure, which stores coordinates explicitly, raster coordinates are contained in the ordering of the matrix. Groups of cells that share the same value represent the same type of geographic feature.


    1. [ESRI software] In ArcGIS, an in-memory representation of a raster dataset. A raster may exist in memory as a subset of a raster dataset; it may have a different cell size than the raster dataset; or it may exist using a different transformation than the raster dataset.

    raster data model

    See Also: vector data model

    1. [data models] A representation of the world as a surface divided into a regular grid of cells. Raster models are useful for storing data that varies continuously, as in an aerial photograph, a satellite image, a surface of chemical concentrations, or an elevation surface.

    1. Knowledge of scales (e.g., visual, verbal, relative, absolute, physical, and display vs. data)

    verbal scale – expresses in words a relationship between a map distance and ground distance

    One inch represents 16 miles.

    Visual scale – graphic scale or bar scale

    representative scale – representative fraction or ratio scale – 1:24,000 – 1” = 24000”

    Size of Scale – Representative Fraction (RF)

    Large Scale – 1:25,000 or larger

    Medium Scale – 1:1,000,000 to 1:25,000

    Small Scale – 1:1,000,000 or smaller

    (a) Convert the verbal scale "1 inch to 18 miles" to an RF
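Working the conversion: express both sides of the verbal scale in the same unit (inches), and the RF denominator falls out directly.

```python
# Converting a verbal scale ("1 inch represents 18 miles") to a
# representative fraction: put map and ground distance in the same unit.
INCHES_PER_MILE = 5280 * 12   # 63,360 inches in a mile

ground_miles = 18
rf_denominator = ground_miles * INCHES_PER_MILE
print(f"RF = 1:{rf_denominator:,}")  # RF = 1:1,140,480
```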


    An absolute scale is a system of measurement that begins at a minimum, or zero point, and progresses in only one direction. An absolute scale differs from an arbitrary, or "relative," scale, which begins at some point selected by a person and can progress in both directions. An absolute scale begins at a natural minimum, leaving only one direction in which to progress. This natural minimum must be an intrinsic property of the measured dimension rather than a natural side-effect of its progression (i.e.: Water freezes and boils naturally at certain temperatures, but these are not natural minimums or maximums of temperature.)

    physical scale?

    display vs data?

    abstract data – what we draw but isn’t there (political boundaries)

    physical – land masses and bodies of water

    abstracted from their true physical appearance and simplified in a way that allows me to see only what’s useful

    1. Knowledge of units of measurement (e.g., conversion and angular vs. metric) 

    1 mi = 5280 ft

    1 ft = 0.3048 m

    1 mi = 1.6093 km

    1 int nautical mile = 1,852 m = 6,076.12 ft = 2,025.37 yd
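The conversion factors above can be checked with simple arithmetic; the only exact definitions needed are 1 ft = 0.3048 m and 1 international nautical mile = 1,852 m. A minimal sketch:

```python
# Sketch of the unit conversions listed above.
FT_PER_MILE = 5280.0
M_PER_FT = 0.3048              # exact, by international definition
KM_PER_MILE = FT_PER_MILE * M_PER_FT / 1000  # 1.609344
M_PER_NAUTICAL_MILE = 1852.0   # exact, international nautical mile

print(round(KM_PER_MILE, 4))                            # 1.6093
print(round(M_PER_NAUTICAL_MILE / M_PER_FT, 2))         # 6076.12 (ft)
print(round(M_PER_NAUTICAL_MILE / (3 * M_PER_FT), 2))   # 2025.37 (yd)
```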


    90° in a right angle, 60 minutes of arc in one degree, 60 seconds of arc in a minute

    Radians – a full circle is 360°, or 2π radians – circumference = 2π × radius

    Bearings – angle less than 90° within a quadrant defined by the cardinal directions

    Azimuth – angle between 0° and 360° measured clockwise from north
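The bearing and azimuth definitions above imply a simple quadrant-by-quadrant conversion rule. A minimal sketch (the quadrant formulas are the standard surveying convention, not taken from the source):

```python
# Sketch: convert a quadrant bearing (e.g., "S 30 E") to an azimuth
# (0-360 degrees measured clockwise from north). Quadrant rules assumed:
#   N x E -> x          S x E -> 180 - x
#   S x W -> 180 + x    N x W -> 360 - x
def bearing_to_azimuth(start: str, angle: float, end: str) -> float:
    rules = {
        ("N", "E"): angle,
        ("S", "E"): 180 - angle,
        ("S", "W"): 180 + angle,
        ("N", "W"): 360 - angle,
    }
    return rules[(start.upper(), end.upper())] % 360

print(bearing_to_azimuth("N", 45, "E"))  # 45
print(bearing_to_azimuth("S", 30, "E"))  # 150
print(bearing_to_azimuth("N", 30, "W"))  # 330
```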


    not sure what angular vs metric is

    1. Data Manipulation

      1. Knowledge of selection queries (e.g., attribute, spatial, and location)


    New Selection, Add to Selection, Remove from Selection, Subset Selection, Switch Selection, Clear Selection



    Within a Distance

    Contains – features contain an input polygon

    Completely Contains – features must completely contain an input feature

    Contains Clementini – same as Contains, except that a feature is not selected when the input lies entirely on its boundary (no part inside or outside)

    Within – input layer will be selected if they are within a selecting feature – selecting feature must be polygons

    Completely Within – features in input must be completely within the selecting features (polygons)

    Within Clementini – features in input must be completely within the selecting features and cannot be entirely on the boundary of the features

    Are Identical To – features are identical to input layer

    Boundary Touches – features in the input layer will be selected if their boundary touches a selecting feature – applies to lines and polygons – the input feature must lie completely inside or completely outside the selecting polygon

    Share a Line Segment With – features in the input layer will be selected if they share a line segment

    Crossed by the Outline of – input features will be selected if they are crossed by the outline of a selecting feature – must be lines or polygons

    Have their Center In – features will be selected if their center falls within a selecting feature

    Contained By – same as Within
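The idea behind "Have their Center In" can be illustrated with the classic ray-casting (even-odd) point-in-polygon test. Real GIS software uses more robust geometry engines; this is only a sketch of the concept, not Esri's implementation:

```python
# Sketch: is a point (e.g., a feature's center) inside a polygon?
# Standard ray-casting test: count how many polygon edges a horizontal
# ray from the point crosses; an odd count means "inside".
def point_in_polygon(pt, polygon):
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does the edge straddle the ray's y-level?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return inside

square = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(point_in_polygon((5, 5), square))   # True
print(point_in_polygon((15, 5), square))  # False
```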

    difference between spatial and location?!?!

    1. Knowledge of different data types (e.g., SHP, GDB, Coverage, DGN, TXT, and IMG) and formats (spatial, rendered, and tabular)

    SHP – shapefile

    .shp – shape format – feature geometry itself

    .shx – shape index format – positional index of the feature geometry to allow seeking forwards and backwards quickly

    .dbf – attribute information

    .prj – projection format

    .sbn & .sbx – spatial index

    .shp.xml – geospatial metadata in XML format

    GDB – geodatabase

    .gdb – file geodatabase

    .mdb – personal geodatabase based on microsoft access

    coverage file

    coverage feature class

    1. [ESRI software] In ArcInfo, a classification describing the format of geographic features and supporting data in a coverage. Feature classes include point, arc, node, route, route system, section, polygon, and region. One or more coverage features are used to model geographic features; for example, arcs and nodes can be used to model linear features, such as street centerlines. The tic, annotation, link, and boundary feature classes provide supporting data for coverage data management and viewing.

    DGN – MicroStation (Bentley) CAD design file; AutoCAD drawings are typically DWG or DXF

    Txt – Text

    IMG – Image

    LiDAR – remote sensing technology that measures distance by illuminating a target with a laser and analyzing the reflected light

    Raster – .jpg, .tif, .gif

    is this just knowing the extensions? and knowing what’s in it?

    1. Knowledge of different field types

    Short integer – between -32,768 and 32,767

    Long integer – between -2,147,483,648 and 2,147,483,647

    Float (single-precision floating-point numbers)

    Double (double-precision floating-point numbers)

    Text – could be a coded value – assign to an integer through a domain


    BLOBs – data stored as a long sequence of binary numbers – ArcGIS stores annotation and dimensions as BLOBs – images, multimedia, bits of code

    Object Identifiers – Unique IDs and FIDs

    Global Identifiers – Global ID and GUID – data types store registry style strings consisting of 36 characters enclosed in curly brackets

    Raster field types – raster can be stored within the geodatabase

    Geometry – point, line, polygon, multipoint, multipatch
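The integer ranges above follow directly from the field widths: a short integer is 16 bits and a long integer is 32 bits, with one bit for the sign. A quick check:

```python
# Sketch: signed-integer ranges from bit width.
# 16-bit short integer and 32-bit long integer.
def signed_range(bits: int):
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

print(signed_range(16))  # (-32768, 32767)
print(signed_range(32))  # (-2147483648, 2147483647)
```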

    1. Knowledge of data relationships (e.g., one to one and many to many)

    1-1 relationship – each object of the origin table/feature class can be related to zero or one object of the destination table/feature class

    1-Many relationship – each object in the origin table/feature class can be related to multiple objects in the destination table/feature class

    Many-Many relationship – multiple objects of the origin table/feature class can be related to multiple objects of the destination table/feature class
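The three cardinalities can be sketched with plain dictionaries. The parcel/permit/owner names below are hypothetical, purely for illustration; note that a many-to-many relationship is usually realized through an intermediate (junction) table of key pairs:

```python
# Sketch of relationship cardinality with hypothetical parcel data.
one_to_one = {"parcel_101": "deed_A"}                    # one deed per parcel
one_to_many = {"parcel_101": ["permit_1", "permit_2"]}   # many permits

# Many-to-many: an intermediate table of (origin, destination) pairs.
many_to_many = [("parcel_101", "owner_X"),
                ("parcel_101", "owner_Y"),
                ("parcel_102", "owner_X")]
owners_of = {}
for parcel, owner in many_to_many:
    owners_of.setdefault(parcel, []).append(owner)
print(owners_of["parcel_101"])  # ['owner_X', 'owner_Y']
```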

    1. Knowledge of data collection, transfer, and format conversion (e.g., export formats, properties, and settings)

    Data collection:

    Primary data sources are those collected in digital format specifically for use in a GIS project

    Secondary sources are digital and analog datasets that were originally captured for another purpose and need to be converted into a suitable digital format for use in a GIS project.

    Data collection Workflow:

    Planning includes establishing user requirements, garnering resources, and developing a project plan. 

    Preparation involves obtaining data, redrafting poor-quality map sources, editing scanned map images, removing noise, setting up appropriate GIS hardware and software systems to accept data.

    Digitizing and transfer are the stages where the majority of the effort will be expended.  Editing and improvement covers many techniques designed to validate data, as well as correct errors and improve quality.  

    Evaluation is the process of identifying project successes and failures. 

    Primary Data:

    Remote Sensing

    -3 types of Resolution – key physical characteristic of remote sensing systems

    -Spatial Resolution – size of object that can be resolved and the most usual measure is the pixel size

    -Spectral resolution – parts of the electromagnetic spectrum that are measured

    -Temporal resolution – repeat cycle – frequency with which images are collected for the same area

    Surveying – Ground surveying based on the principle that the 3-D location of any point can be determined by measuring angles and distances from other known points

    Ground survey – time consuming and expensive activity

    -Used to capture buildings, land, and property boundaries, manholes and other objects

    GPS is another method

    LiDAR – scanning laser rangefinder to produce accurate topographic surveys


    Scanning

    -Scan hardcopy maps, film, paper maps, aerial photographs, images

    -Map, aerial photographs and images are scanned prior to vectorization

    Vector data capture – digitizing vector objects from maps and other geographic data sources

    heads-up digitizing and vectorization – process of converting raster data into vector data

    Digitize vector objects using a mouse or digitizing cursor

    Measurement error – human errors during digitizing – overshoots, undershoots, invalid polygons, sliver polygons

    -Rubbersheeting – assumes that spatial autocorrelation exists among errors

    Photogrammetry – science and technology of making measurements from pictures, aerial photographs, and images

    measurements are captured from overlapping pairs of photographs using stereo plotters

    Orientation  – process of creating a stereo model suitable for viewing and extracting 3D vector coordinates that describe geographic objects

    Triangulation – used to assemble a collection of images into a single model so that accurate and consistent information can be obtained from large areas

    Orthoimages – images corrected for variations in terrain using a DEM

    COGO data entry – COGO – coordinate geometry – methodology for capturing and representing geographic data

    COGO – uses survey-style bearings and distances to define each part of an object

    COGO – very precise measurements and are often regarded as the only legally acceptable definition of land parcels

    Syntactic translation – converting specific digital symbols (letters and numbers) between systems

    Semantic translation – converting the meaning inherent in geographic information

    Attribute data – entered by direct data loggers, manual keyboard entry, optical character recognition, voice recognition

    Data collection – expensive

    Types of collection – data capture or data transfer

    Two capture methods – primary (direct measurement) and secondary (indirect derivation)

    GPS –

    GPS – 24 satellites – orbit Earth twice a day (one revolution every 12 hours) – altitude of about 12,000 miles – started by the U.S. Department of Defense in the 1970s for military use

    Space segment – NAVigation Satellite Timing and Ranging (NAVSTAR) constellation – GPS satellites which transmit signals on two phase modulated frequencies – transmit a navigation message that contains orbital data for computing the positions of all satellites

    Ground segment – called the control segment – Master Control Station – near Colorado Springs Colorado – monitoring locations around the world – purpose of control segment is to monitor satellite transmissions continuously to predict the satellite ephemeris, to calibrate satellite clocks and update the navigation message periodically

    User segment stands for the total GPS user community – a user will typically observe and record the transmissions of several satellites and apply solution algorithms to obtain position, velocity, and time

    Standard Positioning Service – signal broadcast for civilian use

    Horizontal location – 3 satellites are required

    Vertical position – min 4 satellites are required

    Calculate distance by measuring the time interval between the transmission and reception of a satellite signal

    Trilateration – used to determine position of the GPS receiver
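Trilateration in two dimensions can be sketched directly: subtracting the circle equations pairwise turns the problem into a small linear system. GPS solves the 3-D equivalent (plus a receiver clock-bias unknown, which is why a fourth satellite is needed for a full fix); this is a simplified illustration, not a GPS solver:

```python
import math

# Sketch of 2-D trilateration: given three anchor points and measured
# distances to each, solve for the unknown position. Subtracting the
# circle equations pairwise yields two linear equations in (x, y).
def trilaterate(p1, r1, p2, r2, p3, r3):
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = r1**2 - r2**2 - x1**2 + x2**2 - y1**2 + y2**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = r1**2 - r3**2 - x1**2 + x3**2 - y1**2 + y3**2
    det = a1 * b2 - a2 * b1  # assumes anchors are not collinear
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

truth = (3.0, 4.0)
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
dists = [math.dist(truth, a) for a in anchors]
print(trilaterate(anchors[0], dists[0], anchors[1], dists[1],
                  anchors[2], dists[2]))  # approximately (3.0, 4.0)
```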

    -Accuracy dependent on type of GPS receiver, field techniques, post processing of data, error from various sources

    3 Types of GPS receivers

    -Recreational Grade – accuracy within 5 to 20 meters, no ability to post-process data, can do real-time correction using the Wide Area Augmentation System (WAAS) – can be used to navigate to a specific area – compiles uncorrected GPS data

    -Mapping Grade – accuracy from sub meter to 5 meters – GPS receivers can log raw GPS data – enabling data to be post-processed using GPS software – higher level of precision – GPS receiver can communicate with a base station – store attributes of features, use a data dictionary and upload data from the GPS device to a PC

    -Survey or High Accuracy Grade – instruments with associated software that can achieve one centimeter relative accuracy – land surveyors for boundary, topographic, and geodetic surveys, photogrammetry and other activities requiring high accuracy

    GPS errors

    -Multipath – errors caused by reflected GPS signals arriving at the GPS receiver – nearby structures and other reflective surfaces

    -Atmosphere – GPS signals can experience delays when traveling through the atmosphere – Common atmospheric conditions can affect GPS signals such as tropospheric delays and ionospheric delays

    -Distance from Base Station – differential correction will increase the quality of the data, accuracy is degraded slightly as the distance from the base station increases

    -Selective Availability – intentional degradation of the GPS signals by the department of defense (DOD) to limit accuracy for non-U.S. military and government users – currently turned off, but can turn it back on whenever

    Noise – error is the distortion of the satellite signal prior to reaching the GPS receiver and or additional signal piggy backing onto the GPS satellite signal

    Before collecting – Planning

    -Satellite availability and known outages – be sure that satellites will be available – United States Coast Guard maintains a website that generates a digest of known forecasted GPS satellite outages – digest called Notice Advisory to NAVSTAR Users (NANU)

    -PDOP – Position Dilution of Precision – collect data when there is optimum satellite availability (four or more satellites) and when satellites are in an appropriate configuration to produce an acceptable (lower) PDOP value – higher PDOP values mean less reliable positions

    -Local Obstructions of the Sky – be aware of local obstructions such as a canyon, forest canopy, etc.

    -GPS data dictionary design – designed for specific project to make project efficient based on information being collected

    Set before going to the field

    PDOP values – set to 6 or less. Higher levels will be less reliable data

    Signal to Noise Ratio (SNR) mask – set the value of the SNR mask higher to help minimize noise error – user manufacturer recommendations

    Elevation Mask – set it to 15 degrees – default angle to minimize the amount of atmosphere through which the satellite signal has to travel

    Data Collection Rate (sync rate) – recommended to collect point data at 1-second interval – collect polygon and line data at a 5 second interval – collect point data at the same data collection interval as the base station

    Datum – GPS receivers are designed to collect GPS positions relative to the WGS84 datum – can designate what datum to be used

    Projection – Make sure projection is correct

    Unit of Measure – be aware of the units of measure with each projection

    UTM – is in meters

    State Plane is in US survey Feet or meters

    GPS coordinates

    Latitude/Longitude – Degrees/Minutes/Seconds (DMS) – 43° 5′ 20″

    Latitude/Longitude – Decimal Degrees (DD) – 43.088889°

    Latitude/Longitude – Degrees and decimal minutes – 43° 5.33333′

    UTM 18 – (4740283N, 434057E)

    State Plane – US feet – (312608N, 313525E)

    US National Grid – (18T WN 7125315437)
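The three latitude/longitude notations above are interconvertible with simple arithmetic (1 degree = 60 minutes = 3,600 seconds). A minimal sketch, using the 43° 5′ 20″ example from the notes:

```python
# Sketch: convert between DMS, decimal degrees, and degrees/decimal minutes.
def dms_to_dd(deg: float, minutes: float, seconds: float) -> float:
    return deg + minutes / 60 + seconds / 3600

def dms_to_ddm(deg: float, minutes: float, seconds: float):
    return deg, minutes + seconds / 60  # degrees and decimal minutes

print(round(dms_to_dd(43, 5, 20), 6))  # 43.088889
d, m = dms_to_ddm(43, 5, 20)
print(d, round(m, 5))                  # 43 5.33333
```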

    QC – use high resolution orthophotos to see if there are gross errors

    Data Collection

    GPS Receiver Antenna – orient the GPS antenna skyward – and not block antenna with their hands and body and head

    Prohibit Editing the Data Dictionary

    Data Download – Download data as soon as possible to minimize risk of losing the data

    Post-Processing – As soon as data downloaded

    -Rapid Identification of reference stations that are out of service

    -Avoidance of encountering a condition where reference stations have been deleted

    -Compliance with a standardized workflow procedure

    Base station being used – recommended to only use NOAA/NGS base stations – advanced users can establish their own base station

    Metadata – According to the FGDC – Federal Geographic Data Committee – Maintains the value of the data set over time, preserves the data description, allows users to search for and use existing geospatial data and contributed to an NSDI clearinghouse


    Spatial Data Transfer Standard (SDTS)

    SDTS is “a robust way of transferring earth-referenced spatial data between dissimilar computer systems with the potential for no information loss. It is a transfer standard that embraces the philosophy of self-contained transfers, i.e. spatial data, attribute, georeferencing, data quality report, data dictionary, and other supporting metadata all included in the transfer” (USGS).

    -Draft standard published in The American Cartographer (1988)

    -FIPS (Federal Information Processing Standards) 173 approved 1992

    -Standard consists of several parts

    The American National Standards Institute’s (ANSI) Spatial Data Transfer Standard (SDTS) is a mechanism for archiving and transferring of spatial data (including metadata) between dissimilar computer systems. The SDTS specifies exchange constructs, such as format, structure, and content, for spatially referenced vector and raster (including gridded) data. The SDTS includes a flexible conceptual model, specifications for a quality report, transfer module specifications, data dictionary specifications, and definitions of spatial features and attributes.

    The U.S. Geological Survey (USGS) remains the designated maintenance authority for the base standard and SDTS Parts 4 (TVP) and 5 (RPE). Maintenance of other profiles will be conducted by the sponsoring organization(s).

    data transfer

    1. [data transfer] The process of moving data from one system to another or from one point on a network to another.

    GML – open, vendor-neutral eXtensible Markup Language (XML) encoding for transport and storage of geographic information – 

    Format Conversion –

    Vector Formats

    Hardware Specific Formats

    There are two types of formats, those that preserve and use the actual ground coordinates of the data and those that use alternative page coordinate description of the map. Page Coordinates are used when a map is being drafted for display in a computer mapping program or in the data display module.  In the late 1970s, programs came out that were device independent.

    The Hewlett-Packard Graphics Language (HPGL) is a page description language designed for use with plotters and printers. Each line of the file contains one move command, so a line segment connects two successive lines or points. It is unstructured and does not store or use topology.


    PostScript is a page definition language that is usually used to export or print a map rather than data. It supports graphics in both vector and raster formats. Postscript is used commonly by Adobe, and most printers are able to read it.

    Digital Exchange Format (DXF)

    DXF is an external format for transferring files between computers or between software packages. It is produced by AutoCAD. It does not have topology, but offers good detail on drawings, line widths and styles, colors, and text. DXF is typically constructed in 64 layers. Each layer consists of different features, allowing the user to separate features.

    Omaha Public Power District uses this kind of software. It is a turn-key system with street and power line layers. The problem is that you cannot tell what street a power line is on or closest to, because the format lacks topology and spatial analysis.

    Digital Line Graph (DLG)

    DLGs are distributed by the government, and are available at 1:100,000 and 1:24,000 scales. Features are in separate files that most GIS packages will import, although extra data manipulation is often necessary. DLGs consist of line work with the contours removed, therefore elevation is not available.


    TIGER format was first distributed by the U.S. Census Bureau in 1990. It includes block-level maps of every village, town, and city in the United States. It includes geocoded block faces with address ranges of street numbers. This means that they include topology and can address match. The maps are a combination of DLG and DBF/DIME files. They used the 1980 Census Bureau's maps along with the USGS's DLG maps, thus combining urban and nonurban areas.

    TIGER consist of an arc/node type arrangement with separate files for points (zero cells), lines (one cells) and areas (two cells) that are linked together by cross-indexing. Cross-indexing means some features can be encoded as landmarks that allow GIS layers to be tied together.


    A shapefile is a vector data format for storing the location, shape, and attributes of geographic features. A shapefile is stored in a set of related files and contains one feature class.

    Scalable Vector Graphics

    An SVG is an image format that is an extension of the XML language. Any program that recognizes XML can display the SVG image. The scalable part of the term emphasizes that you can zoom in on an image and not lose resolution. SVG files also have the advantages of being smaller, and arriving faster, than conventional image files such as GIF, PDF, and JPEG.

    Arc-Info Coverage

    This is a data model for storing geographic features using ArcInfo software. A coverage stores a set of thematically associated data considered to be a unit. It usually represents a single layer, such as soils, streams, roads, or land use. In a coverage, features are stored as both primary features (points, arcs, polygons) and secondary features (tics, links, annotation). Feature attributes are described and stored independently in feature attribute tables. Coverages cannot be edited in ArcGIS.

    Arc-Info Interchange File (.e00)

    An ArcInfo interchange file, also known as an export file, is a file format used to enable a coverage, grid or TIN and an associated INFO table to be transferred between different machines. ArcInfo interchange files have a .e00 extension, which increments to .e01, .e02, and so on, if the interchange file is composed of several separate files.


    A geodatabase is an object-oriented data model that represents geographic features and attributes as objects and the relationships between objects, hosted inside a relational database management system. A geodatabase can store objects such as feature classes, feature datasets, nonspatial tables, and relationship classes.

    Raster Formats

    Standard Raster Format

    Many of the formats are based on photographic formats. The file structure has a header with a fixed length and a keyword or "magic number" to identify the format. In the header the length of one record in bits and the number of rows and columns can be found. Often the header also has a color table. This explains what colors to project.

    Tagged Image File Formats (TIFF)

    This format is associated with scanners. It saves the scanned images and reads them. TIFF can use run length and other image compression schemes. It is not limited to 256 colors like a GIF.


    A GeoTIFF extends the TIFF header with georeferencing information, such as the latitude/longitude of the pixel edges.

    Graphic Interchange Format (GIF)

    Graphic Interchange Format. A file format for image files, commonly used on the Internet. It is well-suited for images with sharp edges and relatively few gradations of color.

    Joint Photograph Experts Group (JPEG)

    JPEG is a common picture format. It uses a variable-resolution compression system offering both partial and full resolution recovery.


    Digital Elevation Models or DEM have two types of displays. The first is 30-meter elevation data from 1:24,000 seven-and-a-half minute quadrangle map. The second is the 1:250,000 3 arc-second digital terrain data. DEMs are produced by the National Mapping Division of USGS.

    Band Interleaved by Pixel (BIP), Band Interleaved by Line (BIL)

    BIP and BIL are formats produced by remote sensing systems. The primary difference among them is the technique used to store brightness values captured simultaneously in each of several colors or spectral bands.

    RS Landsat

    Landsat satellite imagery and BIL information are used in RS Landsat.  In one format, using BIL, pixel values from each band are pulled out and combined. Programs that use this kind of information include IDRISI, GRASS, and MapFactory. It is fairly easy to exchange information from within these raster formats.

    Data Conversion

    Raster-to-Raster & Vector-to-Vector

    There are many types of vector formats used in GIS, and even more raster formats. It is often necessary to change between file formats, even if they are both raster, or both vector, to make data sets useable together. There are many free, and commercial, translator and converter software available on the web. Some GIS programs support this type of conversion also; for example, the conversion tool available in ArcGIS can be used to switch between a number of formats.

    Raster-to-Vector & Vector-to-Raster

    Moving from vector to raster is not that difficult. A line or polygon is simply given a pixel value. The opposite is not true though. The problem is that one line might be several pixels wide, therefore one has to skeletonize the line, often leaving it very jagged. This is a time consuming and complicated procedure. Sometimes it is impossible to exchange, and one cannot move between the formats. If this is the case, the map has to be re-digitized. In other instances, there is just a poor translation, and data is lost in the exchange.

    Data Standards

     Government Standards

    The Federal Information Processing Standard 173, called the spatial data transfer standard (SDTS), was established for the exchange of data between different formats. It is extremely complicated because it has to produce a bibliography, a terminology, and a complete list of geographic and map features. It also has to address the issue of data accuracy.

    Industry Standards

    Two major points can be made about the industry. The first is that none of the industry standards exchange topology with the data; they only transfer the graphic information. The second point is that with many different formats each package has to include a large number of format translators.

    Open GIS Consortium

    The Open Geospatial Consortium, Inc. (OGC) is a non-profit, international, voluntary consensus standards organization that is leading the development of standards for geospatial and location based services. Through member-driven consensus programs, OGC works with government, private industry, and academia to create open software application programming interfaces for geographic information systems (GIS) and other mainstream technologies.

    GML or Geography Markup Language is an XML based encoding standard for geographic information developed by the OpenGIS Consortium (OGC). The objective is to allow internet browsers the ability to view web based mapping without additional components or viewers.

    1. Knowledge of data quality, including geometric accuracy, thematic accuracy, resolution, precision, and fitness for use

    1. Geospatial Data

      1. Knowledge of metadata and its standards (e.g., ISO and FGDC)

    Content Standard for Digital Geospatial Metadata (CSDGM)

    • ISO 19115:2003 Geographic information – Metadata (corrigendum 1): The base ISO metadata standard for the description of geographic information and services. Expected to be replaced by ISO 19115-1:2014 – Geographic Information – Metadata – Part 1: Fundamentals.

    • ISO 19115–2:2009 Geographic information – Metadata – Part 2: Extensions for the description of imagery, gridded data and data collected using instruments, e.g. monitoring stations and measurement devices. These extensions also include improved descriptions of lineage and processing information. 

    • North American Profile (NAP) of ISO 19115: A US and Canada joint profile of ISO 19115:2003 that extends some domains, increases conditionality for some elements, and specifies best practices for populating most elements.

    • ISO 19110:2005 Geographic information – Methodology for Feature Cataloging: An affiliate standard that supports the detailed description of feature types (roads, rivers, classes, rankings, measurements, etc.) in a manner similar to the CSDGM Entity/Attribute Section. The standard can be used in conjunction with ISO 19115 to document geospatial data set feature types or independently to document data models or other feature class representations.

    • ISO 19119:2005 Geographic information – Services – Amendment 1: Extensions of the service metadata model An affiliate standard that supports the detailed description of digital geospatial services including geospatial data portals, web mapping applications, data models and online data processing services. The standard can be used in conjunction with ISO 19115 to document services associated with a specific data set/series or independently to document a service.

    • ISO 19139:2007 Geographic information — Metadata — XML schema implementation: An XML document that specifies the format and general content of an ISO 19115 metadata record. 

    Data Type – The type of geospatial resource you document will affect your standard selection.

    • CSDGM was developed for the documentation of GIS vector, raster and point data.

    • ISO 19115 was developed for the documentation of GIS vector and point data and geospatial data services such as web-mapping applications, data catalogs, and data modeling applications.

    • ISO 19115-2 fully includes ISO 19115 and adds elements to describe imagery and gridded data as well as data collected using instruments, e.g. monitoring stations and measurement devices.

    metadata represents the who, what, when, where, why, and how of the resource

    -include core library catalog elements such as title, abstract, and publication date

    -geographic extent and projection information 

    -database elements such as attribute label definitions and attribute domain values

    1. Understanding of the difference between quality control and quality assurance in the context of a given geospatial project


    Quality Control – used in developing systems to ensure products or services are designed and produced to meet or exceed customer requirements

    Quality Assurance – refers to planned and systematic production processes that provide confidence in a product’s suitability for its intended purpose – set of activities intended to ensure that products satisfy customer requirements in a systematic, reliable fashion. QA cannot absolutely guarantee the production of quality products

    two principles – QA 

    Fit for purpose – the product should be suitable for the intended purpose

    Right first time – mistakes should be eliminated

    Quality Assurance is process oriented and focuses on defect prevention, while quality control is product oriented and focuses on defect identification.

    Quality Assurance – QA is a set of activities for ensuring quality in the processes by which products are developed

    Quality Control – QC is a set of activities for ensuring quality in products – the activities focus on identifying defects in the actual products produced


    QA is a set of activities for ensuring quality in the processes by which products are developed.

    QC is a set of activities for ensuring quality in products. The activities focus on identifying defects in the actual products produced.

    Focus on

    QA aims to prevent defects with a focus on the process used to make the product. It is a proactive quality process.

    QC aims to identify (and correct) defects in the finished product. Quality control, therefore, is a reactive process.


    Goal

    The goal of QA is to improve development and test processes so that defects do not arise when the product is being developed.

    The goal of QC is to identify defects after a product is developed and before it's released.


    Techniques

    QA: Establish a good quality management system and assess its adequacy; perform periodic conformance audits of the operations of the system.

    QC: Find and eliminate sources of quality problems through tools and equipment so that customer requirements are continually met.


    How

    QA: Prevention of quality problems through planned and systematic activities, including documentation.

    QC: The activities or techniques used to achieve and maintain the quality of the product, process, and service.


    Responsibility

    Everyone on the team involved in developing the product is responsible for quality assurance.

    Quality control is usually the responsibility of a specific team that tests the product for defects.


    Verification is an example of QA

    Validation/Software Testing is an example of QC

    Statistical Techniques

    Statistical tools and techniques can be applied in both QA and QC. When they are applied to processes (process inputs and operational parameters), they are called Statistical Process Control (SPC), and SPC becomes part of QA.

    When statistical tools and techniques are applied to finished products (process outputs), they are called Statistical Quality Control (SQC), which comes under QC.
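    As a sketch of the SPC idea (a hypothetical example, not tied to any particular QA workflow): Shewhart-style control limits are computed as the process mean plus or minus a multiple of the standard deviation, and measurements outside the limits flag a process that needs attention.

    ```python
    import statistics

    def control_limits(samples, sigmas=3):
        """Shewhart-style control limits (mean +/- sigmas * stdev)
        for a sequence of process measurements."""
        mean = statistics.mean(samples)
        stdev = statistics.pstdev(samples)  # population standard deviation
        return mean - sigmas * stdev, mean, mean + sigmas * stdev

    # Hypothetical positional-error measurements (metres) from a digitizing process
    errors = [0.42, 0.38, 0.45, 0.40, 0.43, 0.39, 0.41, 0.44]
    lcl, centre, ucl = control_limits(errors)
    out_of_control = [e for e in errors if not (lcl <= e <= ucl)]
    ```

    Here the field name and error values are illustrative only; in practice the samples would come from repeated measurements of the same process.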

    As a tool

    QA is a managerial tool

    QC is a corrective tool
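    The QC side (defect identification in the finished product) can be sketched as an automated completeness check. Everything here is hypothetical: the required-field names and record layout are invented for illustration.

    ```python
    REQUIRED_FIELDS = ("parcel_id", "owner", "geometry")  # hypothetical schema

    def qc_check(records):
        """Return (index, missing_fields) pairs for finished records
        that fail the completeness check -- defect identification, the QC side."""
        defects = []
        for i, rec in enumerate(records):
            missing = [f for f in REQUIRED_FIELDS if not rec.get(f)]
            if missing:
                defects.append((i, missing))
        return defects

    rows = [
        {"parcel_id": "P-1", "owner": "Smith", "geometry": "POINT (1 2)"},
        {"parcel_id": "P-2", "owner": "", "geometry": "POINT (3 4)"},  # defect: empty owner
    ]
    ```

    A QA counterpart would instead constrain the process, e.g. making the fields mandatory in the data-entry form so the defect cannot occur.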

    1. Knowledge of data archiving and retrieval

    - provides a mechanism for capturing, managing, and analyzing data change

    - creates and maintains a separate feature class schema associated with the versioned geodatabase

    - when enabled, maintains all changes saved or posted to the DEFAULT version in an associated archive class

    - enables temporal analysis of geospatial resources over time
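    The temporal-analysis idea can be sketched in plain Python: each archived row carries a valid-from and valid-to timestamp, and an "as of" query filters to the versions current at a given moment. The row layout and field names below are invented for illustration (in an Esri archive class the analogous fields are GDB_FROM_DATE and GDB_TO_DATE).

    ```python
    from datetime import datetime

    MAX_DATE = datetime(9999, 12, 31)  # sentinel meaning "still current"

    # Hypothetical archive rows: two historical versions of the same feature
    archive = [
        {"id": 1, "zoning": "R1", "from": datetime(2010, 1, 1), "to": datetime(2015, 6, 1)},
        {"id": 1, "zoning": "C2", "from": datetime(2015, 6, 1), "to": MAX_DATE},
    ]

    def as_of(rows, moment):
        """Return the version of each feature that was current at `moment`."""
        return [r for r in rows if r["from"] <= moment < r["to"]]
    ```

    Querying `as_of(archive, datetime(2012, 1, 1))` retrieves the older "R1" version, while a 2020 query retrieves the current "C2" row.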

    1. Knowledge of the differences among a join, a merge, a union, a clip, and an intersect

    Join – Relates records in multiple tables through a common field so attributes can be combined without duplicating information in the database.

    Merge – Combines multiple input datasets of the same data type into a single, new output dataset. This tool can combine point, line, or polygon feature classes or tables.

    Use append tool to combine input datasets with an existing dataset

    Union – Computes a geometric union of the input features. All features will be written to the output feature class with the attributes of the input features they overlap.

    Clip – Extracts input features that overlay the clip features.

    Intersect – Computes a geometric intersection of the input features. Features or portions of features which overlap in all layers and/or feature classes will be written to the output feature class.



    Symmetrical Difference – Features or portions of features in the input and update features that do not overlap will be written to the output feature class.
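    The area arithmetic behind these overlays can be sketched with axis-aligned rectangles (a simplification, not the geoprocessing tools themselves): the intersection is the overlapping region, the union area follows by inclusion–exclusion, and the symmetric difference is everything except the overlap.

    ```python
    def rect_intersection(a, b):
        """Intersection of two axis-aligned rectangles (xmin, ymin, xmax, ymax);
        returns None when they are disjoint -- analogous to Intersect."""
        xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
        xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
        if xmin >= xmax or ymin >= ymax:
            return None
        return (xmin, ymin, xmax, ymax)

    def area(r):
        return 0.0 if r is None else (r[2] - r[0]) * (r[3] - r[1])

    a = (0, 0, 2, 2)   # 4 square units
    b = (1, 1, 3, 3)   # 4 square units, overlapping a by 1 square unit
    inter = area(rect_intersection(a, b))     # Intersect
    union = area(a) + area(b) - inter         # Union (inclusion-exclusion)
    sym_diff = area(a) + area(b) - 2 * inter  # Symmetrical Difference
    ```

    For these rectangles the intersection area is 1, the union 7, and the symmetric difference 6; real overlay tools generalize this to arbitrary polygons and also combine the attribute tables.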



    1. Knowledge of basic Geomatics

    Geomatics – Geomatics is the science and technology of gathering, analyzing, interpreting, distributing and using geographic (or spatially referenced) information. Geomatics encompasses a broad range of disciplines: Cartography, surveying, mapping, remote sensing, GIS and GPS

    Tools and techniques used in land surveying (total station, level machine, theodolite, plane table, chain, etc.), remote sensing, GIS, global navigation satellite systems (GPS, GLONASS, GALILEO, COMPASS), photogrammetry, and related forms of earth mapping

    GLONASS – Russia

    Galileo – European Union

    COMPASS (BeiDou) – China

    IRNSS – India

    1. Knowledge of basic field data collection  

    check out data collection guidelines at BC
