A GIS Analysis: Sewer System Infrastructure Management Application
by maximizing the use of Geocoding Address and Google Earth™
Introduction
Geographic Information Systems (GIS) have played an important role over the last 15 years because of their ability to combine spatial and non-spatial analysis (Longley, 2005). ArcGIS, one of the most widely used GIS software packages, is used by academia, industry, and the public sector. ArcGIS offers an abundance of powerful tools that can be applied in many fields that involve spatial and non-spatial analysis. One tool that is useful in infrastructure management is address geocoding (Ormsby, Napoleon, Burke, Groessl, & Feaster, 2004). Moreover, ArcGIS can convert and share files with other widely known applications such as Google Earth™, Google Maps™, or Yahoo! Maps™.
Address geocoding is a powerful tool developed by ESRI that converts non-spatial information (i.e., address information) into spatially visualized information so that analysis can be performed accurately (Ormsby, Napoleon, Burke, Groessl, & Feaster, 2004). The geocoding tool in ArcGIS supports infrastructure management applications, for example by mapping into ArcMap the addresses stored in an attribute table that also contains information about sewer system condition, citizens’ complaints, types of defects, etc. Engineers can then perform the analysis required for maintenance work, prioritization, and/or budget allocation.
Furthermore, one effective and free way to publish and share the results of a GIS infrastructure management application is to use Google Earth™. Since GIS and geospatial data come in hundreds of file formats and from many organizations worldwide, it is essential that ArcGIS shapefiles can be transferred or converted into various formats. Fortunately, ArcGIS supports many of these formats directly through out-of-the-box tools and format converters. The converted file (in KML) can thus be distributed freely to non-GIS users, who can open it in the Google Earth™ application, which can be downloaded for free.
Literature Review
Providing a better infrastructure management system has become a major concern for industry and municipal governments because it can save time, cost, and effort. According to Venigalla & Baik (2007), most municipalities assume that integrating GIS into their well-established systems will face obstacles such as limited budgets and the treatment of GIS as a capital expenditure. By showing the benefits of implementing GIS in Engineering Management Service Functions (EMSF), the authors expect that municipalities will consider using GIS in the future, because GIS can shorten task times and reduce overall cost. The article also presents a case study on the use of GIS to automate the Sanitary Sewer Reimbursement (SSR) program in Fairfax County, Virginia.
Integrating GIS into the SSR program enhances productivity, and implementing GIS has several benefits. First, it is easier to identify the location of the sewer system because elements such as sewer lines and manhole structures can be displayed as layers in the GIS interface. Second, an existing database can be joined to the GIS to improve data queries and estimates for analysis. Third, fewer engineers are needed to perform tasks (estimation, calculation, and collection), and in a shorter time than with the traditional method. Finally, because ArcGIS can work in the Visual Basic environment, its functions can be customized to meet the agency's needs (Venigalla & Baik, 2007).
Thus, to enhance the City’s performance in sewer system infrastructure management, the ArcGIS geocoding tool is used. The geocoding tool is expected to improve the effectiveness and efficiency of municipalities in handling sewer system infrastructure management: by identifying address information and mapping the address points, engineers can make better decisions using the worst-first scenario method. The worst-first scenario method is used to assess sewer line problems around the addresses of citizen complaints. That is, the area with the most complaints is the area that needs to be inspected first.
Methodology
In this case study, I have attempted to create a realistic scenario in which an engineer maintains College Station’s sewer system. The worst-first scenario method is used to evaluate sewer line problems around the addresses of citizen complaints; the area with the most complaints is prioritized in the inspection process. The complaint system is used to support maintenance work, prioritization, and/or budget allocation for the underground infrastructure system (i.e., the sanitary sewer system). The complaint information includes addresses, types of complaints, etc. For spatial analysis, the address information is mapped into ArcMap using the geocoding process. Once the addresses have been mapped in ArcMap, the analysis can be performed. By knowing the locations of the complaints, we can predict which infrastructure facilities (i.e., sewer lines) in the area might have a problem, and we can send engineers to assess only the few sewer lines surrounding that area. This approach saves time and effort and provides better infrastructure management.
After all the complainants’ addresses have been mapped in ArcMap, the writer used ArcGIS capabilities such as buffer, intersect, and data summary analysis to find which sewer lines most likely have a problem. Buffer, one of the ArcToolbox tools, is applied to the address points that have been mapped in ArcMap to select the surrounding sewer lines that might have a problem. In this case, the writer selected a 200-foot buffer around the geocoding result. In addition, the intersect tool in ArcGIS is used to create points where the buffer areas and the sewer lines coincide. The number of points that lie on a sewer line therefore indicates the number of complaints associated with that line, and the sewer lines with the most points are the lines that need to be prioritized for maintenance, budget allocation, and other infrastructure management work.
Finally, the results in ArcGIS need to be published and distributed to citizens. Google Earth™ is free software that can do this. To convert the ArcGIS file format into the Google Earth™ format, we need an ArcScript extension. Fortunately, a free extension can be downloaded from the ESRI website that converts a shapefile (i.e., ArcGIS file format) to KML (i.e., Google Earth™ format). This extension can also show one field of the shapefile’s attribute table in Google Earth™.
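The buffer, intersect, and summary idea described above can be sketched outside ArcGIS in plain Python. This is only an illustrative approximation under stated assumptions: it measures the distance from each complaint point to each sewer segment instead of running the ArcToolbox Buffer and Intersect tools, each sewer line is simplified to a single straight segment, and the line identifiers and coordinates are hypothetical rather than taken from the case study.

```python
import math
from collections import Counter

BUFFER_FT = 200.0  # buffer distance quoted in the text, in feet


def point_to_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def count_complaints_per_line(complaints, sewer_lines, buffer_ft=BUFFER_FT):
    """Count, for each sewer line, the complaint points within buffer_ft of it."""
    counts = Counter()
    for point in complaints:
        for line_id, (start, end) in sewer_lines.items():
            if point_to_segment_distance(point, start, end) <= buffer_ft:
                counts[line_id] += 1
    return counts


# Hypothetical data in projected feet; not taken from the case study.
sewer_lines = {
    "A": ((0.0, 0.0), (1000.0, 0.0)),
    "B": ((0.0, 500.0), (1000.0, 500.0)),
}
complaints = [(100.0, 50.0), (400.0, 150.0), (900.0, 420.0), (500.0, 250.0)]

counts = count_complaints_per_line(complaints, sewer_lines)
ranked = counts.most_common()  # worst-first: most complaints at the top
print(ranked)
```

In this sketch a complaint that lies farther than the buffer distance from every line (the last point) contributes to no line, which mirrors a buffer that touches no sewer segment.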
Data Acquisition and Preparation
To perform the GIS analysis for this project, I chose the City of College Station as the study area. The data needed for this analysis are shapefiles of the Brazos County boundary, the College Station parcelized zoning areas, the Brazos County roads, and the sewer lines. All the data needed for this analysis were acquired from the City of College Station website.
After all the required data had been downloaded, the next step was to prepare the data for the analysis. Based on the information provided, the writer selected NAD_1983_StatePlane_Texas_Central_FIPS_4203_Feet as the projection for the Data Frame, and this projection is used for the entire analysis. To avoid errors in the analysis, every required shapefile must have the same projection as the Data Frame. Thus, the projection of each shapefile needed to be checked against the Data Frame coordinate system.
To prepare the attribute table needed for the geocoding process, the writer conducted a survey on several streets within College Station. The survey locations are Francis Dr, Berkeley St, Dominik Dr, Fairview Ave, Luther St, Highlands St, and Monclair Ave. The purpose of the survey is to acquire the address data needed for the scenario. The writer then prepared an attribute table that contains the complainants’ names, addresses, dates, types of complaints, etc. The table is shown in Figure 1 and is used in the geocoding process in ArcGIS.
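The projection check described above could be automated along the following lines. This is a minimal sketch, assuming each shapefile is accompanied by a .prj file whose text begins with a PROJCS or GEOGCS name in well-known text form; the layer names and the abbreviated WKT strings are hypothetical, and the check compares names only, not full projection parameters.

```python
import re

EXPECTED_CRS = "NAD_1983_StatePlane_Texas_Central_FIPS_4203_Feet"


def crs_name_from_wkt(wkt):
    """Return the top-level coordinate system name from .prj-style WKT text."""
    match = re.match(r'\s*(?:PROJCS|GEOGCS)\["([^"]+)"', wkt)
    return match.group(1) if match else None


def find_mismatched_layers(layer_wkt, expected=EXPECTED_CRS):
    """List the layers whose coordinate system name differs from the expected one."""
    return sorted(
        name for name, wkt in layer_wkt.items()
        if crs_name_from_wkt(wkt) != expected
    )


# Hypothetical, abbreviated WKT strings; real .prj files carry full parameters.
layer_wkt = {
    "sewer_lines": 'PROJCS["NAD_1983_StatePlane_Texas_Central_FIPS_4203_Feet",'
                   'GEOGCS["GCS_North_American_1983"]]',
    "roads": 'GEOGCS["GCS_North_American_1983"]',
}

mismatched = find_mismatched_layers(layer_wkt)
print(mismatched)  # layers that would need to be reprojected before analysis
```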
Figure 1 Complaints table for sanitary sewer system in the City of College Station.
Analysis and Procedures
A. Geocoding Address Analysis
1. Creating a New Address Locator
After all the data required for the analysis had been prepared, I performed the analysis using ArcGIS. The first step of the analysis is geocoding the addresses. Two datasets are mainly needed for this: the Brazos County roads shapefile (i.e., bzroads) and the complaints table.
Additionally, geocoding requires a database of properly formed addresses, a reference database of streets, and a set of rules for matching them. In ArcMap, this set of rules is known as a Geocoding Service. Fortunately, the US Census Bureau maintains a database of streets with address ranges, which has been enhanced in many commercial geocoding products. Figure 3 shows the roads shapefile (bzroads), and Figure 2 shows its attribute table, which contains the address ranges.
Figure 2 Attribute table of a shapefile (Roads in Brazos County) that has address range.
Figure 3 Shapefile of roads in Brazos County.
Using the information provided in the table, the location of an address within a specific area can be found using the geocoding system in ArcGIS.
To geocode addresses, the first thing needed is an address locator. An address locator is a tool in ArcGIS that converts textual descriptions of locations into geographic features. It is stored and managed in a workspace of our choice, which can be a file folder, file geodatabase, personal geodatabase, or SDE geodatabase. In this case, the address locator is created in the working folder called GISProject_Summer. A new address locator can be created in ArcCatalog by right-clicking the working folder, as shown in Figure 4.
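To illustrate how an address range on a street segment can yield a location, the following sketch places a house number by linear interpolation along the segment. It is a simplified stand-in, not the ArcGIS locator's method: it ignores the separate left and right address ranges, odd/even parity, and side offsets that street locators normally use, and the field names (from_addr, to_addr, start, end) and the sample segment are hypothetical.

```python
def interpolate_address(house_number, segment):
    """Estimate a point along a street segment from its address range."""
    low, high = segment["from_addr"], segment["to_addr"]
    if not min(low, high) <= house_number <= max(low, high):
        raise ValueError("house number outside the segment's address range")
    # Fraction of the way along the segment; a one-address range maps to the start.
    fraction = 0.0 if high == low else (house_number - low) / (high - low)
    (x1, y1), (x2, y2) = segment["start"], segment["end"]
    return (x1 + fraction * (x2 - x1), y1 + fraction * (y2 - y1))


# Hypothetical street segment in projected feet.
segment = {
    "name": "EXAMPLE ST",
    "from_addr": 100,
    "to_addr": 198,
    "start": (0.0, 0.0),
    "end": (980.0, 0.0),
}

location = interpolate_address(149, segment)
print(location)  # halfway along the segment for the midpoint of the range
```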
The next step is to choose the address locator style. There are several options for the address locator style and it can be selected based on data availability or analyst preference. In this case, US Streets with Zone is selected because it matches the attribute table of the shapefile (Figure 5).
Figure 4 Visualization of creating an address locator.
Figure 5 Address locator style selection for the new address locator.
On the Primary table tab, click the Browse button next to the Reference data text box to open the Choose Reference Data dialog box. In this case, select the shapefile bzroads as the reference data; it contains the information needed by the address locator. As shown in Figure 6, the Fields text boxes are filled in automatically with the attribute table fields. They can also be selected manually to match the standardization. Additionally, on the right side of the dialog (Figure 6) there are several options that control the sensitivity of the address locator, and they can be modified. For example, we can lower the spelling sensitivity to less than 80%, which makes the address locator “less sensitive” (less strict) when it tries to match the spelling of an address. For this project, the writer decided not to change the default sensitivity settings.
Figure 6 Fields text boxes filled with standardized address attributes.
After the address locator has been created and all the data have been gathered, the geocoding process can be performed.
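The effect of a spelling sensitivity threshold can be illustrated with a small string-similarity sketch. This is an assumption-laden analogy only: it uses Python's difflib ratio as the similarity measure, which is not the algorithm ArcGIS uses, and the function names are hypothetical. The reference street names are those surveyed for this project.

```python
from difflib import SequenceMatcher


def spelling_score(candidate, reference):
    """Similarity of two street names on a 0-100 scale (case-insensitive)."""
    ratio = SequenceMatcher(None, candidate.upper(), reference.upper()).ratio()
    return 100.0 * ratio


def match_street(name, reference_names, sensitivity=80.0):
    """Return the best-scoring reference name, or None if it scores below the threshold."""
    scored = [(spelling_score(name, ref), ref) for ref in reference_names]
    best_score, best_name = max(scored)
    return best_name if best_score >= sensitivity else None


reference_names = [
    "FRANCIS", "BERKELEY", "DOMINIK", "FAIRVIEW",
    "LUTHER", "HIGHLANDS", "MONCLAIR",
]

# A misspelled name is accepted at a threshold of 80 but rejected at 90.
print(match_street("Dominic", reference_names, sensitivity=80.0))
print(match_street("Dominic", reference_names, sensitivity=90.0))
```

A lower threshold accepts more misspellings at the cost of a higher risk of matching the wrong street, which is the trade-off behind leaving the default setting unchanged.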
2. Geocoding Addresses using Attribute Table Information
The Geocode Addresses tool maps the address information stored in an attribute table into ArcMap. Geocoding converts non-spatial information (i.e., address information) into spatially visualized information so that analysis can be performed accurately. In this case, geocoding is used to prioritize maintenance of the underground infrastructure system (i.e., sewer system) in the City of College Station. Geocoding addresses from an attribute table (in this case, the complaint table) involves several steps.
First, on the drop-down menu, click Tools, point to Geocoding, and then click Geocode Addresses. A dialog box appears (Figure 7). In the dialog box, add the required table (i.e., Table2). On the Address Input Fields tab, select the Address field for the Street or Intersection text box. Because the attribute table (i.e., complaint table) contains no zone (zip code) information, the zone field can be left blank. Then define the location in which to save the output shapefile of the geocoding result. Unlike the Find tool, this technique stores the geocoding result in shapefile format, so we can reuse the information whenever we need it.
Figure 7 Dialog window for creating the geocoding result.
Second, the Review/Rematch Addresses dialog box appears (Figure 8). Its Statistics tab shows how many addresses fall under each matching criterion, including unmatched addresses. The dialog box also provides several rematch criteria to choose from, and addresses can be rematched interactively or automatically. In this case, the writer chose to rematch the addresses interactively because the rematching process can then be observed. The addresses to be rematched are the unmatched addresses.
Figure 8 Window dialog for creating Review/Rematch Addresses
To obtain a higher score, we can modify the current address. Click the Modify button under the Standardize address tab (Figure 9). The Edit Standardization window appears, and changes can be made there. Correcting the address here does not change the address in the attribute table; it only corrects the address in ArcMap so that the right point is created on the map. As Figure 9 shows, the score now becomes 100%. The score varies depending on the level of detail of the address. For example, changing the name from “Dominic” to “Dominik” and “St” to “Dr” without adding the zip code (i.e., zone) gave a score of 66%, whereas adding the zip code raised the score to 100%.
Figure 9 Address correction process by using Interactive Review window dialog.
Once all the addresses have been matched, a point shapefile of all the addresses is created. The result is shown in Figure 10, where the red points represent the addresses in the attribute table. For a complete explanation of geocoding addresses, please refer to Appendix A: Step by Step in Geocoding Addresses.
Figure 10 Geocoding result for the complaint table of the City of College Station.
B. Infrastructure Management Application
Once the addresses have been mapped in ArcMap, the analysis can be performed. By knowing the locations of the complaints, we can predict which infrastructure facilities (i.e., sewer lines) in the area might have a problem. This approach saves time and effort and provides better infrastructure management. In this case, the writer selected a 150-foot buffer around the geocoding result. In addition, the intersect tool in ArcGIS is used to create points where the buffer areas and the sewer lines coincide.
The next important step is to identify which segments need to be prioritized. One way to perform this task is to identify the sewer lines that are “touched” most often by the buffer result. The sewer lines with the most “touched points” are the lines that receive the most complaints and deserve the most attention. In other words, the number of points that lie on a sewer line indicates the number of complaints associated with that line, and the lines with the most points are the ones that need to be prioritized for maintenance, budget allocation, and other infrastructure management work. The result is shown in Figure 11.
Figure 11 Buffer and intersection result, which creates points that coincide with the sewer lines.
To identify which sewer lines have the most complaints, the writer summarized the attribute table by sewer line identification number, so that the city engineer can prioritize the sewer line(s) for maintenance work. In this case study, the field to be summarized is ACCT_SWR_ID (Figure 12). The summary result is shown in Figure 13. From the result, we can conclude that the sewer lines with identification numbers 641 and 1444 intersect the buffer shapefile most frequently (i.e., they are the lines with the most points).
Figure 12 Fields on the attribute table of the single point shapefile.
Figure 13 Summary for the SinglePoints shapefile.
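The Summarize operation on the ACCT_SWR_ID field amounts to a frequency count of that field across the intersection points. A minimal sketch follows, assuming the point attribute table has been read into a list of dictionaries; the identifiers and record counts below are hypothetical and are not the results reported in Figure 13.

```python
from collections import Counter


def summarize(records, field):
    """Count how many records share each value of the given field."""
    return Counter(record[field] for record in records)


# Hypothetical intersection-point records; only the field name comes from the text.
intersect_points = [
    {"ACCT_SWR_ID": 101},
    {"ACCT_SWR_ID": 102},
    {"ACCT_SWR_ID": 101},
    {"ACCT_SWR_ID": 103},
    {"ACCT_SWR_ID": 102},
    {"ACCT_SWR_ID": 101},
]

summary = summarize(intersect_points, "ACCT_SWR_ID")
top_two = summary.most_common(2)  # the two lines with the most points
print(top_two)
```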
Although the summary result identifies the lines with the most points, it is not connected to the map, so we cannot locate the sewer lines. To locate them on the map, a table join or relate is necessary. In this case, the writer chose the “relate table” technique. Two tables need to be related. The first is the attribute table of the SewerProblem shapefile (i.e., sewer lines that potentially have problems). The second is the summary table (i.e., Sum_Output table), which records which sewer lines have the most points. Figure 14 shows the relate process.
Figure 14 Joins and Relates process of two tables.
Once the two attribute tables have been related, they are connected to each other (Olivera, 2008). In other words, if we click one of the sewer identification numbers in the Sum_Output table, the corresponding sewer line is located on the map because the table has been related to the SewerProblem shapefile (Figure 15).
Figure 15 Result from the process of relating two attribute tables.
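Conceptually, relating the two tables is a keyed lookup from each summary row to the sewer line features that share its identification number. The sketch below illustrates that idea with plain dictionaries; it is not how ArcMap implements relates, and the rows, counts, and geometries are hypothetical. Only the ACCT_SWR_ID field name and the roles of the two tables come from the text.

```python
def relate(summary_rows, line_features, key="ACCT_SWR_ID"):
    """Map each summary row's key to the line features that share that key."""
    by_id = {}
    for feature in line_features:
        by_id.setdefault(feature[key], []).append(feature)
    return {row[key]: by_id.get(row[key], []) for row in summary_rows}


# Hypothetical stand-ins for the Sum_Output table and the SewerProblem layer.
sum_output = [
    {"ACCT_SWR_ID": 101, "count": 3},
    {"ACCT_SWR_ID": 102, "count": 2},
]
sewer_problem = [
    {"ACCT_SWR_ID": 101, "geometry": [(0.0, 0.0), (100.0, 0.0)]},
    {"ACCT_SWR_ID": 102, "geometry": [(100.0, 0.0), (100.0, 80.0)]},
    {"ACCT_SWR_ID": 103, "geometry": [(100.0, 80.0), (200.0, 80.0)]},
]

related = relate(sum_output, sewer_problem)
# Selecting an id from the summary now yields the geometry to highlight on the map.
print(related[101][0]["geometry"])
```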
A complete explanation of how to perform this task is given in Appendix B: Infrastructure Management Application.
C. Publishing Results on Google Earth™
ArcGIS offers an optional extension, the Data Interoperability extension, that extends its support to many more GIS data formats (An overview of KML support in ArcGIS, 2009). This enables ArcGIS to read dozens of additional non-native formats and to work with them directly, just as with native ArcGIS formats. The Data Interoperability extension also makes it possible to define new custom data sources and data transformation procedures that support advanced transformations between a variety of GIS and tabular data structures. In this case study, since Google Earth™ uses the KML or KMZ format, an ArcGIS shapefile must be converted so that the result can be displayed in Google Earth™.
Keyhole Markup Language (KML) is a language for presenting GIS data as a series of graphics within Google Earth™, Google Maps™, and other web-based mapping applications that support KML (An overview of KML support in ArcGIS, 2009). Additionally, we can define how users explore and interact with KML elements within Google Earth™ and/or Google Maps™. Furthermore, ArcGIS supports a number of key KML capabilities, such as point features, line features, polygon features, imagery, and map documents.
ArcGIS versions 9.2 and 9.3 provide two different ways of converting a shapefile (SHP) to KML (or KMZ). Additionally, because the ordinary conversion tools in ArcGIS cannot show labels from the attribute table, a free extension was obtained from the ESRI website. Appendix C: Publishing Result on Google Earth™ provides complete information on how to convert SHP to KML using both the ArcGIS conversion tool and the ArcScript extension. In this case study, since the writer uses ArcGIS version 9.2, the conversion tool in ArcGIS 9.2 was used.
The extension that the writer obtained from the ESRI website is Export to KML 2.5.4, which was developed for ArcGIS 9.x by the City of Portland, Bureau of Planning. The extension allows ArcGIS users to export GIS data in KML format for viewing in Google Earth™ (ArcScripts, 2009). Any point, polyline, or polygon dataset, in any defined projection, can be exported. Export to KML 2.5.4 also supports features such as incorporating ArcMap layer symbology into the exported KML; labeling point, line, and polygon features; "describing" individual features using the database attributes; and storing database attributes as "schema" items. Before working with the Export to KML 2.5.4 extension, several steps need to be performed, including downloading the file from the ESRI website and installing it on the computer. These steps are presented in detail in Appendix C: Publishing Result on Google Earth™.
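To make the KML structure concrete, the sketch below writes a minimal KML document containing one labeled line feature, using only Python's standard library. It illustrates the file format only and is not the ArcGIS conversion tool or the Export to KML extension. The label value and coordinates are hypothetical; KML expects coordinates as longitude,latitude[,altitude] in WGS84 decimal degrees, so projected coordinates would first need to be converted.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml(lines):
    """Build a KML string with one LineString Placemark per (label, coords) pair."""
    ET.register_namespace("", KML_NS)  # serialize KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for label, coords in lines:
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = str(label)
        line = ET.SubElement(placemark, f"{{{KML_NS}}}LineString")
        # KML coordinate tuples are lon,lat,alt separated by whitespace.
        ET.SubElement(line, f"{{{KML_NS}}}coordinates").text = " ".join(
            f"{lon},{lat},0" for lon, lat in coords
        )
    return ET.tostring(kml, encoding="unicode")


# Hypothetical sewer line labeled by its identification number (lon, lat pairs).
kml_text = build_kml([(101, [(-96.34, 30.62), (-96.339, 30.621)])])
print(kml_text)
```

Saving the returned string to a file with a .kml extension gives a document that KML viewers such as Google Earth™ can open.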
After the installation is complete, the conversion using the Export to KML 2.5.4 extension can be performed. Before working with the extension, we first need to activate it. On the drop-down menu, select Tools and click Extensions. The Extensions window appears and lists the available extensions, including the newly added Export to KML extension (Figure 16). To activate the extension, check the box beside Export to KML.
Figure 16 List of extensions in ArcGIS.
Figure 17 Activating the Export to KML tool using toolbars from the drop down menu.
In addition, to display the Export to KML toolbar, open the View drop-down menu, select Toolbars, and check Export to KML (Figure 17). The Export to KML toolbar appears with the Google Earth™ icon. To convert the shapefile to KML using the Export to KML extension, click the icon (i.e., Google Earth™ logo), and the Export to Google Earth KML window appears.
Next, in the Export to Google Earth KML window, choose the shapefile to be converted under Select the Layer to Export. We can also select which attribute is used for labeling features (i.e., ACCT_SWR_ID). Then define the location of the saved KML file under Name and Location of KML Output (Figure 18).
Furthermore, in the Export to Google Earth KML window, clicking the Options button opens a new window called Export to KML (Options), as shown in Figure 19. This window enables the user to customize KML attributes such as layer transparency, feature description, etc. For this exercise, the writer left the default settings.
Figure 19 Options on Shapefile to KML that can be customized.
Finally, all the SHP data that have been converted to KML are ready to be opened in Google Earth™. With this extension, every converted file can be opened automatically in Google Earth™ if the Google Earth™ application is installed on the computer. The Google Earth™ view is shown in Figure 20.
Figure 20 KML result in Google Earth view.
Conclusion
From this case study, the writer draws several conclusions. First, geocoding is one of ArcGIS's powerful tools for supporting infrastructure asset management, especially for sewer systems, because a sewer system is an underground infrastructure system that is difficult to visualize. Using a citizen complaint system enhanced by ArcGIS’s geocoding tools, the city engineer can apply a worst-first scenario to fix problems or perform maintenance in the areas that have the most complaints.
Second, to support maintenance or reconstruction work at a finer level of detail, the city engineer can use ArcGIS tools such as buffer, intersect, and table summary analysis to identify which sewer line segments potentially have problems. Since it is difficult to analyze and make decisions about infrastructure facilities that are buried underground, GIS analysis of the sewer system helps save time, cost, and effort: the city engineer does not have to inspect the whole neighborhood, but only the few sewer lines recommended by the GIS analysis.
Third, for accountability, the GIS results need to be published and distributed freely to citizens. One effective way to distribute the files is to use the Google Earth™ application, which can be downloaded freely from the internet, so that non-GIS users can also see the analysis results. In this case study, since Google Earth™ uses the KML or KMZ format, an ArcGIS shapefile must be converted to KML or KMZ so that the result can be displayed in Google Earth™. The only concern in the conversion process is the coordinate system. To align the converted file correctly in Google Earth™, the ArcGIS shapefiles have to be in their original coordinate system. If they are not, the converted result can appear in Google Earth™ far from its true geographic location.
Finally, the ArcScript extension from the ESRI website is a useful and powerful tool for converting shapefiles to KML because it can customize the KML file by adding labels, information, transparency, etc. On the other hand, since the extension is distributed freely, it also has limitations, such as showing only one attribute table field for each KML file. However, this limitation is not significant, since the purpose of sharing and publishing the data is accountability, not complex analysis.
References
An overview of KML support in ArcGIS. (2009). Retrieved August 07, 2009, from ArcGIS Desktop 9.3 Help: http://webhelp.esri.com/arcgisdesktop/9.3/index.cfm?TopicName=An_overview_of_KML_support_in_ArcGIS
ArcScripts. (2009). Retrieved August 07, 2009, from ESRI Support Center: http://arcscripts.esri.com/scripts.asp?eLang=&eProd=&perPage=10&eQuery=kml
College Station GIS Services. (2008). Retrieved November 9, 2008, from City of College Station Website: http://www2.cstx.gov/gisdownloads/default.asp
GIS Map of Brazos County. (2008). Retrieved November 9, 2008, from Brazos County Website: http://www.co.brazos.tx.us/maps/mapzoom/
Longley, P. A. (2005). Geographic Information Systems and Science. London: John Wiley & Sons, Ltd.
Olivera, F. (2008). CVEN 658 Course Notes. College Station: Zachry Department of Civil Engineering, Texas A&M University.
Ormsby, T., Napoleon, E., Burke, R., Groessl, C., & Feaster, L. (2004). Getting to Know ArcGIS Desktop. Redlands: ESRI Press.
Venigalla, M. M., & Baik, B. H. (2007). GIS-Based Engineering Management Service Functions: Taking GIS beyond Mapping for Municipal Governments. Journal of Computing in Civil Engineering, 21(5), 331-342.