In [1]:
!date
Wed Dec 11 05:04:58 CST 2019

first things first, [here's the link to the data](http://former.vancouver.ca/projects/burrard/statistics.htm#Measuring).

second things second.

In [5]:
import pandas as pd
In [13]:
bridge_stats = pd.read_csv('vancouver_bridge_stats.csv')
In [14]:
bridge_stats.head()
Out[14]:
date comments northbound pedestrians southbound pedestrians total pedestrians northbound bicycles southbound bicycles total bicycles northbound vehicles southbound vehicles total vehicles
0 June 1, 2009 NaN NaN NaN NaN NaN NaN NaN 31,541 34,811 66,352
1 June 2, 2009 NaN NaN NaN NaN NaN NaN NaN 33,141 35,670 68,811
2 June 3, 2009 NaN NaN NaN NaN NaN NaN NaN 34,045 37,060 71,105
3 June 4, 2009 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 June 5, 2009 NaN NaN NaN NaN NaN NaN NaN 35,393 37,823 73,216

sick nasty, it already filled in all the empty values with NaN

In [33]:
comma_cols = bridge_stats.columns[2:]
In [48]:
def just_wanna_make_you_comma():
    for i in comma_cols:
        bridge_stats[i] = bridge_stats[i].str.replace(',','')
        
# don't write functions at home like this kids
In [50]:
just_wanna_make_you_comma() # hey ya by outkast anyone?
In [51]:
bridge_stats.head()
Out[51]:
date comments northbound pedestrians southbound pedestrians total pedestrians northbound bicycles southbound bicycles total bicycles northbound vehicles southbound vehicles total vehicles
0 June 1, 2009 NaN NaN NaN NaN NaN NaN NaN 31541 34811 66352
1 June 2, 2009 NaN NaN NaN NaN NaN NaN NaN 33141 35670 68811
2 June 3, 2009 NaN NaN NaN NaN NaN NaN NaN 34045 37060 71105
3 June 4, 2009 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 June 5, 2009 NaN NaN NaN NaN NaN NaN NaN 35393 37823 73216
In [52]:
bridge_stats[30:40]
Out[52]:
date comments northbound pedestrians southbound pedestrians total pedestrians northbound bicycles southbound bicycles total bicycles northbound vehicles southbound vehicles total vehicles
30 July 1, 2009 Canada Day 1983 1933 3916 2635 2627 5262 27158 30076 57234
31 July 2, 2009 NaN 1428 1475 2903 2223 2118 4342 32766 35272 68038
32 July 3, 2009 NaN 1291 1392 2683 2236 2081 4317 32228 34517 66745
33 July 4, 2009 NaN 1322 1208 2530 2283 2280 4563 29252 31475 60727
34 July 5, 2009 NaN 1214 1070 2284 1765 1934 3700 NaN NaN NaN
35 July 6, 2009 NaN 868 947 1815 1186 1290 2476 29345 31983 61328
36 July 7, 2009 NaN 752 802 1554 1016 1049 2065 31396 33728 65124
37 July 8, 2009 NaN 1089 1086 2175 1082 1232 2314 32667 35460 68127
38 July 9, 2009 NaN 1421 1268 2689 2223 2295 4519 33187 35615 68802
39 July 10, 2009 NaN 1365 1328 2693 2267 2113 4380 32711 33143 65854

that cheeky function worked.
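side note for future me: turns out read_csv has a thousands=',' option that would've made the whole comma function unnecessary. a minimal sketch with made-up sample rows (not the real file):

```python
import io
import pandas as pd

# made-up sample rows in the same shape as the real csv
sample = io.StringIO(
    "date,northbound vehicles,southbound vehicles\n"
    "June 1 2009,\"31,541\",\"34,811\"\n"
    "June 2 2009,\"33,141\",\"35,670\"\n"
)

# thousands=',' strips the separators and parses the columns as numbers
df = pd.read_csv(sample, thousands=',')
print(df['northbound vehicles'].dtype)
print(df['northbound vehicles'].sum())  # 64682
```

one read_csv call, zero cheeky functions.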

In [47]:
num_cols = comma_cols # so i remember wtf i'm doing
In [57]:
def it_all_floats_on():
    for i in num_cols:
        bridge_stats[i] = bridge_stats[i].astype(float)
In [58]:
it_all_floats_on() # crafting dirty functions isn't modest. it's just dirty.
In [60]:
bridge_stats['northbound vehicles'][:10].dtype
Out[60]:
dtype('float64')

ayyy i'm that dude writing just the dirtiest code
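same deal here: the loop works, but pandas can cast a whole block of columns at once, and pd.to_numeric with errors='coerce' turns anything weird into NaN instead of crashing. a sketch on a toy frame (not the real data):

```python
import pandas as pd

# toy frame standing in for the comma-free columns
df = pd.DataFrame({'northbound vehicles': ['31541', '33141', None],
                   'southbound vehicles': ['34811', '35670', '37060']})

cols = ['northbound vehicles', 'southbound vehicles']

# cast every column in one pass; errors='coerce' maps junk/missing to NaN
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')
print(df.dtypes)
```

no loop, no global-mutating function, and missing values stay NaN like they should.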

In [61]:
bridge_stats['northbound pedestrians'].sum()
Out[61]:
524951.0
In [62]:
bridge_stats['southbound pedestrians'].sum()
Out[62]:
583204.0
In [64]:
[bridge_stats['northbound bicycles'].sum(), bridge_stats['southbound bicycles'].sum()]
Out[64]:
[796781.0, 832275.0]
In [66]:
[bridge_stats['northbound vehicles'].sum(),bridge_stats['southbound vehicles'].sum()]
Out[66]:
[14667784.0, 15624684.0]
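instead of hand-picking pairs of columns, filter(like='northbound') grabs a whole direction at once. a sketch using the two July rows from the table above as toy numbers:

```python
import pandas as pd

# two toy days lifted from the July table above
df = pd.DataFrame({'northbound bicycles': [2635, 2223],
                   'southbound bicycles': [2627, 2118],
                   'northbound vehicles': [27158, 32766],
                   'southbound vehicles': [30076, 35272]})

# filter() selects columns by substring, so each direction sums in one line
north = df.filter(like='northbound').sum().sum()
south = df.filter(like='southbound').sum().sum()
print(north, south)  # 64782 70093
```

even on two toy days, south wins.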

[graphs: foot traffic, bicycle traffic, vehicle traffic]

look at my no-manners-no-intro-looking-assssssuming that you'll follow right along with what I'm doing here. you probably won't. that's okay. also I don't capitalize much anymore. it's a whole keystroke that I'm not willing to even stretch for, and explaining that plus capitalizing my "I"'s is so much extra work I barely even want to talk about it but there ya go. I did it.

FOR THE CUSSIN' CONTEXT

welcome back. been awhile. where have you been? I've been right here. literally the entire time.

the entire time

what's all that mumbo jumbo above?

well. that's data my friends. data on data on data.

lemme explain myself

I heard the best way to learn something is by doing it. Someone smart from the mirror told me that.

So I looked up datasets I could mess around with. I looked all over, up and down, left and right, for something that seemed interesting enough to play with and

BOOM

I found one. This one is data collected on people crossing a bridge in Vancouver, mainly to measure bicycle and pedestrian traffic. To see if they ran into each other or whatever. It's all on the site. Go click on that link. Don't make me explain it.

Anyway it's got my three favorite things.

1) Bikes.
2) Vancouver.
3) And bridges.

I made that last one up, there's only one bridge in this situation. but I do love me a cleverly designed bridge used by walkers, cyclists, and cars that don't use fossil fuels.

[photo of Vancouver]

Look at that place.

LOOK AT IT WOULD YA? GORGEOUS RIGHT?

geez settle down and keep reading.

also I don't know which bridge it is up there. that would be interesting to find out.

anyways I was finessing around with the data in Excel because Excel is cool and it can be abbreviated as 'XL' and that's cool too plus it's a pretty easy-to-use-ultra-powerful-data-tool. so keep that in mind.

AND THAT'S WHEN I NOTICED SOMETHING

I noticed that more people travel Southbound than they do Northbound on the bridge, in every category. And not even by a lot. It's like, a teensy eeensy bit every day, and the totals differ by an eeensy bit as well.

apparently south is cooler than north in Vancouver.
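the per-day gap is one subtraction away. a sketch with two of the July pedestrian rows from above:

```python
import pandas as pd

# two toy days from the July table above
df = pd.DataFrame({'northbound pedestrians': [1428.0, 1291.0],
                   'southbound pedestrians': [1475.0, 1392.0]})

# per-day gap: positive means more southbound crossings that day
gap = df['southbound pedestrians'] - df['northbound pedestrians']
print(gap.tolist())  # [47.0, 101.0]
```

a teensy eeensy positive number, both days.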

I mean like...do they stay at a friend's house? Do they take another way back? Do they swim?

(There are days that data wasn't collected, they all could have snuck back on those days too. sneaky peeps)
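counting those uncounted days is one line with isna(). toy sketch here; on the real frame it'd be bridge_stats['total vehicles'].isna().sum():

```python
import pandas as pd

# toy column with an uncounted day in it, like July 5 in the table above
vehicles = pd.Series([27158.0, 32766.0, None, 29345.0])

# isna() flags the gaps; sum() counts them
print(vehicles.isna().sum())  # 1
```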

Finding out the real reason why more people end up south of the bridge than north would probably take longer (and would fall victim to a bad game of thrones joke, which I don't watch but you do you), and I wanted a comprehensive data-mine-to-data-visualization run that I thought wouldn't take me all night.

I could have shown you all of that with the code and the spreadsheet and it would have been like...

uh yeah...cool numbers dude

so instead I got all graphical and made some sick nasty sweet graphics. because everyone loves beautiful graphics.

I actually wrote the graphs in javascript but then realized it might be too hard to copy all that code over, URLs and everything, and also I couldn't figure out for the life of me why I couldn't change the color of the second bar on the second and third graphs ...

so yeah.. whatever I guess. And the pictures work out.

WHAT YOU JUST GOT INTO BY READING THIS

I decided to do another challenge. One of my own making. The #100DaysOfBullroar doesn't really cut it. I will shamelessly use the popular ones to get people to read my posts though.

if you want to see where I got the data from, the link is at the top and the original file is in this repo on github. so be a sleuth and go find it.

alright. go away now.