How to Make Maps Using Mapping Software (QGIS) (user search)
       |           

Welcome, Guest. Please login or register.
Did you miss your activation email?
March 29, 2024, 08:51:38 AM
News: Election Simulator 2.0 Released. Senate/Gubernatorial maps, proportional electoral votes, and more - Read more

  Talk Elections
  About this Site
  Talk Elections Q & A
  How To (Moderator: Virginiá)
  How to Make Maps Using Mapping Software (QGIS) (search mode)
Pages: [1]
Author Topic: How to Make Maps Using Mapping Software (QGIS)  (Read 22330 times)
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« on: October 31, 2017, 03:21:02 AM »

Some posters have asked me for a tutorial on how I make my election maps without tediously using paint. I've taught a few others before, so I thought I'd put up a post here.

You need a few things to make maps:

Quote
You must be logged in to read this quote.

Excel can open .dbf files, it has disabled writing/updating .dbf files. Microsoft claims that there is something fundamentally flawed with the format, such that they can't reliably produce .dbf files. dBase might have at one time been considered a competitor with Office. dbf files exist because that was what ESRI had available when they invented shapefiles. You really don't need to access the .dbf files directly.

Quote
You must be logged in to read this quote.
For this application I would use the cartographic boundary files. These have been simplified. The basic files have too much detail, and crooked boundaries such as rivers may be rendered as thicker lines.

2016 1:5,000,000 County Shapefile (.zip)

Quote
You must be logged in to read this quote.
How was the second column generated?

Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #1 on: October 31, 2017, 06:48:12 PM »

3) Open Office (free) or a similar program that allows you to save and open .dbf files. Excel doesn't.

Excel can open .dbf files, it has disabled writing/updating .dbf files. Microsoft claims that there is something fundamentally flawed with the format, such that they can't reliably produce .dbf files. dBase might have at one time been considered a competitor with Office. dbf files exist because that was what ESRI had available when they invented shapefiles. You really don't need to access the .dbf files directly.

Yes, there is a way merge the data using QGIS, but I always run into the leading zero issue. For some reason, QGIS tends to read the states with a GEOID of 01-09 as a number, mismatching those states (in other words, 01001 becomes 1001, and the data doesn’t merge). Do you have an easy fix for that?
Census ID's are both strings and numbers, but can be manipulated as numbers. The county ID is composed of a State ID and a County ID within a state. Harris County, Texas is "48201", where "48" is the state code for Texas, and "201" is the code for

The DBF file keeps the field type as string. The csv files produced by the Census Bureau for American Fact File does not have data types. You can think of a csv file as a crude form of data base that does not include metadata such as data types. When you open a csv file into Excel it determines data type by looking at the data values. 48201 and 01001 look like numbers, and stored as 48201 and 1001, and displayed as numbers (right justified by default).

There is another problem with leading zeros in GQIS, because of its unixy/c background. A number with a leading 0 denotes an octal number:  030 = 24 is TRUE.

Anyhow what we need to do is store the ID as a string, which is easy enough by using the expression
=TEXT(A1,"00000") which converts a number to a string with at least 5 digits including leading zeros if necessary.
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #2 on: October 31, 2017, 07:08:32 PM »

9) Open tl_2017_us_county.shp in QGIS. You can probably do this by clicking on the file. If not, open QGIS, and either hit CTRL-Shift-V or select Layer-->Add Layer-->Add Vector Layer from the drop-down menu, as you can see in the image below:

I would do this step first. I normally open shapefiles by dragging them into QGIS.

So once you have downloaded QGIS and installed it:

(1) Open QGIS.
(2) Click on the blank page icon in the upper left corner to open a new project, and drag the .shp file into the map panel (big white space).
(3) You have drawn your first map.
(4) Let us know that you are reading this.


Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #3 on: November 01, 2017, 04:16:33 AM »

Census ID's are both strings and numbers, but can be manipulated as numbers. The county ID is composed of a State ID and a County ID within a state. Harris County, Texas is "48201", where "48" is the state code for Texas, and "201" is the code for

The DBF file keeps the field type as string. The csv files produced by the Census Bureau for American Fact File does not have data types. You can think of a csv file as a crude form of data base that does not include metadata such as data types. When you open a csv file into Excel it determines data type by looking at the data values. 48201 and 01001 look like numbers, and stored as 48201 and 1001, and displayed as numbers (right justified by default).

There is another problem with leading zeros in GQIS, because of its unixy/c background. A number with a leading 0 denotes an octal number:  030 = 24 is TRUE.

Anyhow what we need to do is store the ID as a string, which is easy enough by using the expression
=TEXT(A1,"00000") which converts a number to a string with at least 5 digits including leading zeros if necessary.

I just tried that and still end up with NULLs in the 01-09 states' counties when trying to join the =Text column with the GEOIDs from the original .dbf in QGIS.  Is this a limitation in QGIS, or am I doing something wrong?

Edit: I think I figured out a fix by virtually creating another GEOID column, GEOIDFIXED, defined as a text column, using the following formula, then joining GEOIDFIXED and the GEOID column from the .dbf file:

if(length("GEOID,C,5")=4,concat('0',"GEOID,C,5"),"GEOID,C,5")

This seems almost as complicated to explain as just manually manipulating the .dbf file, though - especially when we get to GEOIDs that are longer than 5 characters.
Did you create a layer for the .csv file?

This is what I did. I got a (.csv) file from American Fact Finder with the 2016 county estimates. I converted (saved) this as an .xlsx file (countychange.xlsx). I'll use this as my working file.

I added four columns:

Q3 = TEXT(B3,"00000")
R3 = E3             ! Population April 2010 Base
S3 = L3             ! Population July 2016 Estimate
T3 = S3/R3 - 1   ! Relative change 2010 to 2016

Copy the formula down the columns. I Changed the format of column T values to % with two decimal places.

Create a blank spread sheet, with four columns. In Row 1, add column titles:

ID, POP10, POP20, INCREASE

Copy columns Q,R,S,T from countychange.xlsx and paste values to columns A, B, C, D of new spreadsheet. This loses the formatting of the increase column, but that is OK.

Save new spreadsheet as .csv file countychange.csv.

Create a file with the type of .csvt countychange.csvt.

This is a text file that can be created in Notepad, that provides the format of each field of the .csv file. The contents of the .csvt file should be:

"String","Integer","Integer","Real"

If you have difficulty creating the .csvt file, create it as a .txt file and rename it in File Explorer. The name of the .csv and .csvt file must match.

Do the following:

(1) Open QGIS
(2) Create a new project using the blank page icon in the upper left.
(3) Drag the county .shp file into the map area.
(4) Save the project as, using the diskette icon in the upper left. Make sure you select a directory you can find. This will create a .qgs file which is your project context. Be sure to save this from time to time. If you exit QGIS (or worse it crashes), you can get back to where you were at.

Since you now have a map, we should probably show how to move around, select, and set the projection. But for now I will skip to the join.

(5) Open the .csv file as a layer. You can click on Layer > Add Layer > Add Delimited Text Layer; but it is far easier to click on the Comma icon in the far left of the screen.

(5a) Browse to the .csv file.
(5b) Click on the No Geometry radio button (near the mid-bottom right)
(5c) Click on the OK button.

(6) Join the csv layer to the county layer.

(6a) Right Click on county layer, and select Properties
(6b) Select Joins
(6c) Select Green Plus Sign to add Join
(6d) Select layer to join (the csv layer)
(6e) Specify join field (ID is the name we gave to the first field in the csv file)
(6f) Specify the target field in the county layer (GEOID)
(6g) Click on OK

This completes the join.

(7) Verify Join.
(7a) Make sure county layer is current layer.
(7b) Click on light blue icon that looks like Vermont or Connecticut mortgage card in Monopoly.
(7c) The new fields should be shown.

Note that it is quite easy to update/make corrections/add fields.

(8) Select csv layer, right click and Remove, and confirm removal. This will eliminate joined fields.
(9) In xlsx file add new column U3 = S3-R3   ! Numeric change 2016-2010
(10) Copy and paste values from xlsx column U to csv column E. Add name of column in E1 (CHANGE)
(11) Save CSV file.
(12) Add field type to csvt file ,"Integer"

(13) Redo Join.
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #4 on: November 01, 2017, 04:27:10 AM »

(6) Join the csv layer to the county layer.

(6a) Right Click on county layer, and select Properties
(6b) Select Joins
(6c) Select Green Plus Sign to add Join
(6d) Select layer to join (the csv layer)
(6e) Specify join field (ID is the name we gave to the first field in the csv file)
(6f) Specify the target field in the county layer (GEOID)
(6g) Click on OK

This completes the join.

(7) Verify Join.
(7a) Make sure county layer is current layer.
(7b) Click on light blue icon that looks like Vermont or Connecticut mortgage card in Monopoly.
(7c) The new fields should be shown.

Footnote: The county shapefile includes Puerto Rico and the island areas (Virgin Islands, Guam, Northern Marianas, and American Samoa). Since our xlsx file does not include data for these areas (when I downloaded county estimates, I did US only) the joined fields for the territorial areas will be NULL.
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #5 on: November 01, 2017, 01:26:14 PM »

Did you create a layer for the .csv file?

Yes, I created a layer for the .csv file - but I didn’t know to create a .csvt file. I’ll try that when I get to my computer tonight to confirm it works. I’m sure it does.
If there is no corresponding .csvt file, it uses a default for the data type (I think).

If you right click on a layer, and select Properties > Fields, it will show you the data types for the fields. QGIS might try to figure out data types based on data values. Since the IDs look like integers, the field would be assigned an integer.
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #6 on: November 01, 2017, 02:26:54 PM »

The next few messages will show how to project the data, and do some navigation and selection, and some basic editing.

If you drag the county shp file into the map area it should look like this: (to see this image full size, right click on the image, then select "View Image")



When you load a shp file, QGIS assigns a random color. It is like a box of chocolates, you never know what you are going to get.

Census coordinates are in longitude and latitude. Longitude west of Greenwich, and Latitude south of the equator are negative values. Because this shp file includes Alaska, the western Aleutians have positive longitudes, east of Greenwich. The largest X values are displayed on the far right.

This shp file also includes the island areas. Since Guam and the Northern Marianas are in the Eastern Hemisphere, they are also shown to the right (southwest of the western Aleutians. American Samoa is in the southern hemisphere, but east of 180W latitude, so it is displayed on the lower left, SSW from Hawaii (you may need to clean your screen). Puerto Rico and the US Virgin Islands are to the southeast of Florida.

When I took this screen shot, the cursor was pointing roughly towards Houston, and the Coordinates at the bottom: -95.9, 30.6 show where I was pointing (95.9W, 30.6N)
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #7 on: November 01, 2017, 08:11:41 PM »

As expected, I got the join to work with a .csvt file.

Do you mind if I put the relevant parts of Reply #7 in the first thread as the first of methods to make maps (the other being mine)? I'll add screenshots to illustrate.

Once you're done with your basic overview of QGIS, I'd like to add responses on how to calculate the WINCODE, how to use the Print Composer to make professional-looking maps, and (perhaps) how to make map gifs. (It doesn't seem like we have much of an audience for this How To, though).
OK.

Maybe post links in Political Geography and International Elections and?  I was not aware of "How To"
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #8 on: November 02, 2017, 03:22:03 AM »

Setting A Map Projection

Because the western Aleutians have a longitude east of Greenwich, and thus show up on the right side of the map, and using longitudes at more extreme latitudes causes east-west stretching and area exaggeration at higher latitudes we want to use a map projection.

To do this:

(1) Click on the cloverleaf in the lower right corner.

This screen will popup:



(2) Be sure On The Fly transformation is checked.

(3) There are a lot of transformations, so type "Albers" in the filter box to limit our choices.

(4) Scroll down to "North_America_Albers_Equal_Area_Conic" and click on it.

(5) Click on Apply and OK.

And you should see this:



This is a pretty good projection for continental-sized areas. Local directions are pretty good, and equal area is good for thematic maps. Over longer distances, directions are distorted (e.g. twisting of Alaska) and location of American Samoa and the Marianas.

For presentation quality maps we might want to use different projections for different maps. There are pre-defined Albers Conic projections for the Continental USA, Alaska, and Hawaii.

Saving a Project

We can save a project by clicking on the diskette icon at the upper left. The first time, you will be asked to select a directory. I ordinarily put all my files for a project including shp files, Excel files, text files, image files, etc. in a single directory.

You can also save a project by typing <CTRL>S.

After saving a project, if you exit QGIS, when you restart QGIS you can open an existing project (and you will be shown your recent projects). Double click on the project that you want and QGIS will open to the state where you left off.

It is a good idea to save a project after you have completed some work. Note that saving a project does not save the data files (shp and dbf) associated with it.
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #9 on: November 02, 2017, 11:48:55 AM »

I found out on Twitter that we needn't even do the .csv step. QGIS can export an Excel .xlsx file as a vector file. That seems to import the ID column directly, as a text file.

jimrtex - do you know of any drawbacks to this? If not, I'll further simplify the preferred method instuctions if I have time tonight.
QGIS uses a lot of Python plugins, for things like file import/export. I did not know that you could use xlsx files. I did some googling, and there was a tutorial about importing a lay from a csv file, which then mentioned that you could use an xls file, but that it would not import expressions. But I just tried it with an xlsx file and it did import the values of expressions.

I took a Census American Fact Finder file that had the 2016 county estimates, and which I had added some columns, such as calculating the rate of change from 2010 to 2016. I was able to import this into QGIS. It appears that it is keeping the two header rows from the xlsx file as data rows in the QGIS layer, and is confusing some data types. But this might be able to save a step. Personally, I prefer the csv files that I import into QGIS to be limited to a few pertinent columns and exclude intermediate calculations, but it might still be possible to use a simple spreadsheet that is linked to my main spreadsheet.

This might have application for drawing redistricting. I do my redistricting in a spreadsheet, where I can assign a district number to each county, etc., and then compute things like population for districts, and things such as racial composition. I then import this into QGIS to produce district maps. But it could be possible to let the district assignments to be made in QGIS and these exported to a spreadsheet.

I also exported the shapefile/dbf as an xlsx file. It looks OK. I'm not exactly sure I understand why you need to export the dbf file. In my applications, both the shapefile/dbf files and the census data files have the ID so that they can be joined.
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #10 on: November 02, 2017, 08:27:09 PM »
« Edited: November 03, 2017, 01:11:40 AM by jimrtex »

Navigation in QGIS



These show some of the tool icons that I typically use in QGIS. You can hover with the cursor above the icon to get a tool tip.

Pan and Zoom

The magnifying glass with a plus sign can be used to zoom in on an area. Select the tool, then with the cursor at the upper left corner of the area you want to zoom to, click-and-hold while you move the cursor to the lower right corner of the area you want to zoom to. As you do so, a rectangle showing the area to be zoomed to will be displayed. Release the mouse button, and the area will be zoomed to. You can use any diagonally opposite corners of the zoom box.

The magnifying glass with a minus sign can be used to zoom out on an area. Select the area to zoom out on, in the same way you zoom in. The amount of zooming is based on the area you select. Select a small area, and the zoom out will be large. Select most of the map area, and the zoom out will be minimal, in effect just expanding the area displayed outward a bit.

The hand icon is used to pan the map. Select the tool, and then while clicking-and-holding, drag the map to where you want it. You can also use the scroll wheel to zoom in or out.

Using the zoom and pan functions, we can focus on the contiguous United States.



The rest of the map is still there, which can be confirmed by using the pan option to drag Seattle down towards the center of the map.

Sooner or later, you will pan and zoom so that the map is centered on Buenos Aires (and thus totally blank), and no amount of panning and zooming will bring the United States back into focus. Click on the magnify glass in front of the white paper and you will show the entire layer.

You can also zoom to selected feature(s). Here we have selected Harris County, Texas. Selected features are shown in yellow.



Then click on the magnifying glass in front of the yellow paper and it will zoom to the familiar outline of the home county of the 2017 World Series Champions


Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #11 on: November 02, 2017, 08:36:16 PM »

The only real reason I developed my method was because I never really knew how to join sheets until your Hudson tutorial, and ran into the leading zero issue when I otherwise tried it. It does have its benefits - I often use more than one map making program/website, and don’t have to do multiple joins across platforms. For example, I can just import the zipped shapefile & .dbf to Carto.com without having to use their join function, which sometimes misreads the column types and imports the GEOIDs as numbers or numbers as text. The data and data type is permanently in the .dbf file.

I’m going to try to update the first post tonight to reflect the possibility of just using the Excel file.
You can save a joined layer in QGIS. By default the joined layer is cached in memory, but you can force it to be stored on disk (be sure to browse to a directory where you can find the files).

Since the shp and dbf files are associated by position, I would be wary of getting a mismatch if you deleted or added features.
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #12 on: November 03, 2017, 02:58:07 AM »

After you added labels, I would do the simple way to create a map: Project > Save an Image which creates a png file of the map window. That is what I always use (my use of QGIS is more as an interactive analysis tool).

I tried to use the Print Composer for Hudson, and could never get the Bing Street Maps to line up with the Census Shapefiles. If you use the Census Shapefiles for street features, each street fragment (a block line) is a feature. So you end up with a lot of labeled streets. I merged street features, but it was a tedious process.

One thing you might show, particularly for the Contiguous US, is how to turn the county boundaries off. That is an awful lot of black which overwhelms the colors. If you do that, then you might want to bring a States layer in, and show how to display layers that cover the same area.

I generally use the show all labels, but I generally only use county names when working interactively. If I am having trouble reading labels, I can just zoom in some more. I have used the expression left("NAME",2) to display the first two characters of a county name. But that is mainly when working in an area where all the counties look the same.

I did some googling, and someone was explaining how to label earthquakes based on their magnitude, which was an attribute. The largest earthquakes (> 6.5 magnitude) had their location; those > 4.5 had their magnitude shown, and smaller ones just had a symbol. There apparently is a way to indicate both what to display (content), and how to display it (style)

You can also define another pair of fields that have the offsets. So you may be able to move some labels so that Ketchikan Gateway Borough and Fairbanks North Star can be displayed.

An expression like this can be used.

if (left( NAME,1)='H', "NAME" ,"")

This frivolous example labels counties whose name begins with "H".

If you were displaying the contiguous US, you probably would not label all 3000+ counties, or even include the percentage change. That is why you are using a color theme. But you might want to display the fastest gaining counties, or the largest populations, etc. So you could simply calculate a variable that indicates which counties should be labeled.

if ("IMPORTANT" = 1, "NAME", "")

I just discovered something really cool. I was using American Fact Finder to get populations for county/congressional district fragments. But when use AFF to create a map, you can save it as a shapefile. The saved shapefile does not have the population attribute, but does include attributes for the CD and county numbers, and also an ID value. When you save the data values, it has the same ID, so they may be joined.
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #13 on: November 04, 2017, 04:24:52 AM »


One thing you might show, particularly for the Contiguous US, is how to turn the county boundaries off. That is an awful lot of black which overwhelms the colors. If you do that, then you might want to bring a States layer in, and show how to display layers that cover the same area.

Yeah, we should probably add that in another tutorial. Lines can get annoying.

Is there a simplified state map somewhere on Census’ website, like there is for counties? I generally only have the tigerline files on my computer.
From the home Census Bureau webpage, select the Geography tab at the top, then Maps&Data. Select  TIGER Products, and then Cartographic Boundary Shapefiles. Incidentally, the Census Bureau uses:

TIGER/Line Shapfiles

Tigerlines may be a neologism.

There are quite a bit more maps than were available in the past. A weakness of using separate cartographic files is that they have been simplified independently, and may not match up perfectly (probably close enough for visual purposes. It could be better to generate a state file from the county file in QGIS. This is pretty easy to do for the shapes themselves, but the only useful thing left in the DBF file would the State ID. You could add any data, such as state names and populations easily.
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #14 on: November 04, 2017, 06:16:44 PM »

Calculating Wincodes - Part 1: Manipulating Certain Types of Raw Data

Now that you know how to make a basic map, I will now move on to a lesson on how to calculate WINCODES to be used with the Atlas 8 Color Code color scheme. Part 1 is more a lesson about how to use Excel to make sense of poorly formatted election results than actually calculating Wincodes. If you already know how to do this, feel free to skip to part 2.
What are WINCODES? It would help me know what you are trying to make your xlsx file look like, and what you are ultimately try to produce. It would help to know what your shapefiles contain since we are trying to match them.

Particularly for exemplary purposes, I would focus on a single race. If you want to compare different results, you can always do multiple joins. I would have kept the Absentee, Questionable, and Early Voting separate by legislative district. Since Alaska does not attribute these votes back to election precinct, and they represent about 1/3 of votes, I would do an analysis of how these results varied by legislative district (e.g. share of in person votes, whether different types of votes were politically different. It might be possible to adjust the precinct in person votes. At minimum, you could aggregate results by legislative district.

Quote
You must be logged in to read this quote.
If you save the precinct results as a .csv file, you can open them directly in Excel. Note that "circle" is called a "radio button"

Quote
You must be logged in to read this quote.
Possibly premature to extract ID's, since this is where we merged all of the absentee, questionable, and early votes into a pseudo precinct called "Distri". If we kept them separate, we could create precinct IDs like 40-ABST, 40-QUES, 40-EARL which could then be aggregated into totals for the legislative district. It doesn't matter if not all data can be joined. Note, there are ways to check when there are shapefiles, but with no data to join with them.

If I had done it your way, I would type: "=LEFT(", then click on cell A1, and then typed ",6)" at the point where you select A1, press function key F4 and you can cycle through various absolute and relative expressions. $A1 might be a little bit better if you were going to also fill in columns to the right.

I would then do a GOTO  (G1:G56184) and pressed CTRL-D to fill down. CTRL-R fills right and you can fill up or left from the Fill command.

You could also do a CTRL-End to get to the end of your worksheet, click on G56184 (i.e. last row) and then SHIFT-CTRL-UPARROW to select the entire column followed by CTRL-D.

Quote
You must be logged in to read this quote.
I couldn't find Select Data-->Remove Duplicates.  Instead I clicked on the light bulb and typed "Remove Duplicates" and a menu popped up.

Quote
You must be logged in to read this quote.
I would have entered in B2

=SUMIFS(Data!$F$1:$F$56184,Data!$C$1:$C$56184,Sheet1!B$1,Data!$G$1:$G$56184,Sheet1!$A2)

And then used CTRL-R and CTRL-D to copy across and down. I had not understood why you also select on the Office because of the weird way they did write-ins. When I went back to correct for the write-ins I added the Office criteria for only the write-ins.

Quote
You must be logged in to read this quote.
This is critical. You'll never quite figure out the format that was used (Alaska included results for Senate District 5, rather than Senate District S). Not only did I not have the write-ins, I didn't understand that the "Distri" results were not district totals and repeated.

I've never found a simple way to do partial transposes. You might write a script, but by the time you got it working, you could have used a more brute force method. If every state used the same format, it might work to develop a more automated way.

Amother approach would be sort by Office, Candidate, and Precinct. You could then just copy the column for each candidate. If I did this, I would double check that the precinct order was the same for each column.

Quote
You must be logged in to read this quote.

This is very important step. We want the data in this format, as if the state had provided it this way. You might want to start this tutorial showing the desired end result. The rest is not necessarily showing how to do it, but to suggest a way short of timing in all the results by hand, which would discourage the attempt. Someone reading the tutorial could then skip to the next message.
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #15 on: November 05, 2017, 12:00:51 AM »

Calculating Wincodes - Part 2: Calculating Wincodes in Excel

Republican     | PCT
Ind/Green      | PCT +1
Democrat       | PCT +2
Libertarian     | PCT +3
Others           | PCT +4,5,6 or 7.
Tie                | 9
No Data         | 10
Error             | 11

Ties are generally represented by light gray, no data by dark gray, and errors by white.

Thus, you need to know which candidates are Republicans, which are Democrats, and which are others when creating your Wincode formula. Depending on how the elections officials placed the candidates in the results, they are unique to each race.

I calculate Wincodes using a nested IF function. Here, we haven't precalculated the percentage for each candidate. We can do that - but need not here. It's a simple formula - ={Cell of candidate votes}/{Cell of total votes}.

Here are the formulas for the wincodes. Type (or cut and paste) the following formulas:

(3a) In Cell S2 (Senate Wincode):
=IF(MAX(B2:F2)=LARGE(B2:F2,2),9,IF(E2=MAX(B2:F2),E2/P2,IF(B2=MAX(B2:F2),B2/P2+2,IF(C2=MAX(B2:F2),C2/P2+3,IF(D2=MAX(B2:F2),D2/P2+1,11)))))

Explanation of the formula:
- Since we're not worried about precincts not yet reporting in the final results (in which case, the first IF would be "IF(P2=0,10" where P2 is the total votes cast, checking for ties comes first. MAX(B2:F2)=LARGE(B2:F2,2),9 compares the largest number of votes received in the Senate race against the second highest number of votes received. If they are equal, it's a tie, and QGIS will color the geometry as light gray.
- Next, we check to see if the Republican candidate, Sullivan, received the most votes. If that's true, the wincode is just the Republican's percent, E2/P2, which would give us a shade of blue in QGIS.
- Then, we check to see if the Democratic candidate, Begich, received the most votes. If that's true, the wincode is the Democrat's percent plus 2, B2/P2+2, which would give us a shade of red in QGIS.
- Fourth, we check if the Libertarian candidate received the most votes. If that's true, the wincode is the Libertarian's percent plus 3, C2/P2+3, which would give us a shade of yellow in QGIS.
- Fifth, we check if the Non-affiliated candidate received the most votes. If that's true, the wincode is that candidate's percent plus 1, D2/P2+1, which would give us a shade of green in QGIS.
- Last, if none of these are true, the wincode is 11, which would give us a blank white precinct, telling us something's wrong - either Write-ins won or we screwed up.

I would use a pair of lookup tables:

Begich DEM
Fish     LBT
Gianoutsos IND
Sullivan  REP
Write-in IND

REP 0
IND 1
DEM 2
LBT 3

I would use several columns.

The first would check for a tie, the same way you did.
The next would find the relative column of the winning candidate, where columns C:G have the votes of the candidates.

    =MATCH(MAX(C2:G2),C2:G2,0)

Name of the winning candidate.

    =INDEX($C$1:$G$1,1,M2)

Party of the winning candidate

    =VLOOKUP(O2,$O$8:$P$12,2)

Wincode base of the party:

    =VLOOKUP(P2,$O$14:$P$17,2)

Or you could add two rows, one for the party of the candidate, and the  other for the wincode base of the winning candidate. Since you know the column of the winning candidate for a precinct, calculating the percentage and the wincode base is trivial.
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #16 on: November 05, 2017, 12:05:41 AM »

Calculating Wincodes Part 3: Drawing a multiple layer precinct election map in QGIS.

(6g) Check the box next to Custom field name prefix and delete the text in it.
I hadn't been doing this. The consequence is that the added field have a prefix, and there is a character limit for field names (I think in dBase). As a result the names of fields get truncated.
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #17 on: November 05, 2017, 09:43:55 PM »

What are WINCODES? It would help me know what you are trying to make your xlsx file look like, and what you are ultimately try to produce. It would help to know what your shapefiles contain since we are trying to match them.

Particularly for exemplary purposes, I would focus on a single race. If you want to compare different results, you can always do multiple joins. I would have kept the Absentee, Questionable, and Early Voting separate by legislative district. Since Alaska does not attribute these votes back to election precinct, and they represent about 1/3 of votes, I would do an analysis of how these results varied by legislative district (e.g. share of in person votes, whether different types of votes were politically different. It might be possible to adjust the precinct in person votes. At minimum, you could aggregate results by legislative district.

I'll add a sentence describing WINCODES. It's already explained more in part two - In order to match the relevant QGIS winner color, we need to add an integer to the percentage of all those who won a precinct except Republicans (since the R color is in the 0-1 range). For example, a D winner has to have 2 added to their percentage if they win a precinct to generate a red color in QGIS using the 7 Color scheme.

I checked the shapefile before determining whether I had to truncate the precinct code. A link to the relevant shapefile is in Part 3 of the discussion. Unfortunately, the full precinct names from the Alaska Precinct shapefile are in terrible shape - some are capitalized, some have extraneous spaces, etc. I couldn't do a join on that column (NAME) without getting NULLs.

I wasn't planning on using the data to sum up the results by House District. You're right - that can be another lesson. I suppose we can do the remove duplicates first and then create a Precinct column in Sheet1 instead of doing it on the Data sheet. I'll rewrite the write-up to do that.
I would start with something like this:

In this tutorial we will show how to create maps of precinct-level election results. Each precinct will be colored to show the winning party, and their voting strength. For example, an inner-city precinct might be shown in a deep red, while a suburban precinct won by a plurality in a three-way race might be shown as light pink or light blue.

For this tutorial we need a precinct shapefile and the election data in a tabular form in a spreadsheet. The precinct shapefile for Alaska is here <link to file>. In Part 1 of this tutorial we will prepare the election data in a spreadsheet. Since Part 1 is not about QGIS per se, you may skip to Part 2 and begin with the spreadsheet prepared in Part 1.

Part 1: Creating Tabular Election Data for Use With QGIS.

Our xlsx file will have one row per precinct. Columns will have the precinct name or identification, as well as the votes cast for each candidate in that precinct. We begin with the election results from the Alaska Division of Elections <link>.

[Description of format and loading of file.]

Note: In Alaska, early, absentee, and questionable votes are not attributed to the election precinct of the voter, but rather to the legislative district. These votes represented about 30% of the total vote in 2014. We will not be able to display these results on our precinct-level maps, but we will retain them for use in future analysis (or tutorials).

<how to generate the base spreadsheet>

I would then split Part 3 into two. One would generate the IDs used for the Join. You could open the precinct shapefile to show the format, and explain how you are generating the ordinary ID. We want to retain the original precinct names as well since they include the polling place names. I used 01-EAR, 01-ABS, and 01-QUE, etc. for the other results.

Somewhere we can explain what happens when there is a mismatch between the join variable.

Quote
You must be logged in to read this quote.
Right Click on Properties, Select Join, Click on the + sign, Select the file to be joined, select the source and target variables, Click on OK. Repeat.

Then generate the maps.

If you had data for the 2016 election, there would be no need or reason to put it in a spreadsheet with the 2014 data.

Quote
You must be logged in to read this quote.
What's a GOTO?

That is how I generated the formula - moused over to A1, then hit 6. I wasn't planning on adding any additional columns, so I didn't see the need to cycle through the relative values (in fact, I didn't even know <F4> did that - I usually just manually type the dollar signs.
[/quote]
Ctrl G is the GO TO command. It is sometime useful for selecting long columns. Once you have determined that you want row 1 through row 56184, you can CTRL-G then enter G1:G56184 and those cells will be selected. I accidentally discovered today that Function Key F5 does the same thing.
I just recently discovered <F4>. I could never figure out how tutorials were magically getting the dollar signs, and I had to type them in. Once you start using <F4> you'll probably not go back to typing the $ signs. If you make a mistake, you can select the cell, and place your cursor in formal expression at the top and <F4> to change the format.

Quote
You must be logged in to read this quote.

I knew I should have put a few pictures in this discussion. I also miscapitalized "Select". I'm using the newest version of Excel. It has a Data tab on the top ribbon. If you click on that, there's a Remove Duplicates icon.  I'll add a picture to explain.[/quote]
I have a Data tab, but still don't see Remove Duplicates.

Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #18 on: November 05, 2017, 09:55:02 PM »

A tutorial explaining how you generated these would be useful.

Quote
You must be logged in to read this quote.
I never rename source shapefiles. The census files at least are easy to understand what they are. You haven't actually changed the file by renaming the layer, but it might give the impression that you are.
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #19 on: November 06, 2017, 01:04:01 AM »

One thing I haven’t tried to figure out yet is what happens if you try to import and join an Excel spreadsheet with multiple sheets. Does it import all the sheets or just the first? I’ll have to test that and get back.
When I did that, it asked which sheets I wanted to use.

I don't have any experience with using xlsx files, since I always used csv files, and I would paste just the columns I wanted to add in the CSV file.

Quote
You must be logged in to read this quote.

That's bizarre. I don't think I modified the Excel ribbon at all. The data tab in both my Excel 2007 and Excel 2016 have a Remove Duplicates button. Try right clicking on the Ribbon, selecting Customize the Ribbon and looking for the "Data Tools" group under Commands Not in the Ribbon. Remove Duplicates is in that group.
[/quote]
OK I found it. It is one of 5 small icons, which you have to hover over.
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #20 on: November 06, 2017, 11:04:18 PM »
« Edited: November 07, 2017, 07:55:30 AM by jimrtex »

This Tutorial Illustrates How to Edit Shapefiles. In the previous tutorials we did not change the shapefiles, we only changed how they were rendered. The joins were done in memory. When we save a project, we are saving all the steps that we had performed to reach a particular point.

(1) Start with a new project. You can use an existing project, but it is probably less confusing to start fresh.

(2) Drag the county shapefile cb_2016_us_county_5m.shp into the map area.

(3) Change the On The Fly project to Alaska_Albers_Equal_Area_Conic,

(4) Use the Zoom In tool to focus on Alaska.

We want to keep the original layer, so we make a copy.

(5) Duplicate the county layer by right clicking on the layer, and selecting Duplicate.

(6) Save the (duplicate) layer to disk by right clicking on the layer, and selecting Save As. Give it a name like Alaska_Counties. Be sure to browse to the directory where you are going to save the file and be sure the format is ESRI Shapefile

You might now remove the other layers, so as to reduce confusion. Note that we have not created a layer with Alaska counties only, but rather a duplicate of our original shapefile. You can confirm this by panning to the southeast.

Make sure that your new layer is the current layer (click on it)

(7) Select the area we want to keep.

Using the Select by Expression tool, construct the expression:

    "STATEFP" = 2

You can type this directly, or find the variable STATEFP under Fields and Values and click on it. The double quotes around STATEFP is part of the expression syntax. Click on the = sign, or type it is, and type in 2. The STATEFP for Alaska is 2.

Click on Select, and Alaska should be highlighted in canary yellow. We have not selected Alaska, but rather all the county (equivalents) in Alaska.

If you are doing a different state and don't know the STATEFP, use the Information tool and click somewhere in the state. For example, the STATEFP for Wisconsin is 55. Once you have the information you need, click on the X in the upper right corner of the information panel to eliminate the clutter.

You can also do a graphic selection.

The selection of the counties in the proposed state of Southern California were selected using the Select Features by Freehand.



The selection of the counties in New England were selected using Select Features by Polygon. It also can be done with Select Features using expression.

    "STATEFP" = 9

selects Connecticut. Click on Add to Selection. Then change the 9 to 23 (Maine) and Add to Selection. Repeat for Massachusetts (25), New Hampshire (33), Rhode Island (44), and Vermont (50).



If you make a mistake with your selection, click on Deselect Features and try again.

(8) Invert your selection by selecting Invert Feature Selection (under Select Feature by Expression)

If you are working with Alaska, pan to the southeast to make sure that the continental US is now highlighted in Yellow.

(9) Click on the Pencil Icon in the upper left. This enables editing for the current layer. When a layer is enabled for editing, all the vertices are highlighted with red lines.

Double check that the current layer is the one you want to edit.

(10) Delete the selected areas by pressing Delete on the keyboard (or Edit > Delete Selected)

If this was not what you wanted to happen, click on CTRL-Z to undo the last edit, then click on the the Pencil Icon and go back to Step 7.

(11) Click on the Pencil Icon to complete the editing of the layer. You will be asked if you want to save the edits. Confirm that you do. Note the Pencil Icon is a toggle, it is used to both start and ending editing.

(12) Check that the editing worked. The deleted areas will have disappeared. Click on the Vermont Avenue icon, and scroll down. You should only see the 29 Alaska county (equivalents).

The shapefile that you have just created, can be used in place of the US Counties shapefile that we used in the previous tutorials.
Logged
jimrtex
Atlas Icon
*****
Posts: 11,828
Marshall Islands


« Reply #21 on: November 07, 2017, 10:23:32 AM »

Editing Features With QGIS.

We will be creating an outline map of Alaska, by merging all the counties.

(1) Begin with the Alaska_Counties layer created in the previous tutorial.

(2) Duplicate the layer, by right clicking on the layer, and selecting Duplicate.

(3) Save a copy of the layer, by right clicking on the layer and selecting Save As. Name the layer Alaska_Outline

Be sure to browse to the directory you want to save the shapefile in, and that format is ESRI Shapefile.

To reduce clutter and confusion, remove the other layers (right click on the layer(s) and select Remove). Note you may select multiple layers and remove all simultaneously.

If you are making several edits, it is not necessary to do steps (2) and (3) each time, but it is a good idea to save the starting source layer, and perhaps checkpoints along the way.

(4) Select the features to be merged. Since we are merging all the county (equivalents) in the state, under Select By Expression, Select All Features. The entire state should be yellow.

(5) Click on the Pencil Icon to begin editing. The map will grow red fur.

(6) Edit-> Merge Features. If you do this a lot, you might want to add a tool icon to the Tool Bar.

A table of the selected features will be displayed. Select one. I selected Juneau, but it doesn't really matter. You could pick Mat-Su or some other county (equivalent). The selected area will be orange. Click on Take Attributes from Selected Features, and click on OK. The county boundaries will disappear.

(7) Click on the Pencil Icon to complete editing of the layer. Confirm that you want to save the changes.

(8) Click on Vermont Avenue icon, and you will see that there is now only one row. The information was copied from the table row that we selected in step 6. It is mostly useless. For example the land and water areas are for Juneau, and not the state. For illustration purposes, we will change the name of our feature to Alaska.

(9a) With the table open. Click on the Pencil Icon on the top left edge of the table. This is just a duplicate of the icon on the toolbar.
(9b) Click on the field you want to change (NAME Juneau)
(9c) Type Alaska.
(9d) Click on another field to register that the field has been updated.
(9e) Click on the Pencil Icon.

We can also create some simple displays.

(10a) Right click on the layer select Properties -> Labels.
(10b) In the top pulldown select Show Labels For This Layer
(10c) Select NAME in the next pulldown.
(10d) Click OK.

Alaska is now in the middle of the feature (the entire state). If we had not edited the name in step (9), the label would say Juneau.

We could also have selected the Script-E on the label menu and entered an expression such as 'Not Kansas' as the expression and typed OK. In expressions, literal strings are in single quotes, field names are in double quotes.

We can combine layers.

(11) Drag the Alaska_Counties shapefile into the map, and in the Layers Panel drag it below the Alaska_Outline layer. Since the Alaska_Outlines is on top, it hides the Alaska_Counties. Uncheck the Alaska_Outline layer and we can now see the Alaska_Counties.

To Be Continued


Logged
Pages: [1]  
Jump to:  


Login with username, password and session length

Terms of Service - DMCA Agent and Policy - Privacy Policy and Cookies

Powered by SMF 1.1.21 | SMF © 2015, Simple Machines

Page created in 0.07 seconds with 14 queries.