I gave a talk on crowdsourcing at the State of the Map conference. (The slides are posted on my Twitter feed and are also available on Vimeo.)

One observation I took home from the conference is that the state of public data around the world is similar to that in Canada: governments and their affiliated entities hold on to the data for as long as possible, even though doing so adversely affects their economies and goes against the public good. (According to OSM France's “Bano” project, a country loses up to 0.5% of its GDP due to the lack of publicly available addressing data. Source)

Crowdsourcing is not an optimal solution when there is no data feed from the authoritative source, because it results in datasets that contain errors. Still, in my view it seems to be the only way to open up the data when the decision makers are convinced that keeping the data closed is better for their budgets. (An interesting figure of 0.5 billion pounds was thrown around as the value of a closed postcode list by Royal Mail CEO Moya Greene in her arguments for keeping the dataset closed. She also happened to be Canada Post's CEO at the time it started its legal efforts aimed at enforcing CP's alleged intellectual property rights over Canadian postal codes.)

In France, on the other hand, they don't have such problems, as they have not yet made the effort to create a postcode system like the Canadian or the British ones, so it is hard to make the half-billion-dollar argument there. Still, that does not mean that whatever system they have is open. People from the “Bano” project had to lobby hard to get the list of up to 1000 postal codes created by the French postal service opened to the public, and when it finally was, it was full of errors. Not only that, but the French postal service has 4 different street address datasets (one for regular mail, one for advertising mail, one for parcels and another for a purpose I can't remember now). All 4 have quality issues, and the 4 departments that created them neither talk nor cooperate with each other to improve their respective datasets. Funny stories of government inefficiency at the public's expense. The Economist also wrote a piece on this topic a month ago.

In closing, public data all around the world is at various stages of unavailability because certain people of influence are convinced it is worth a lot of money. Nobody has yet shown how much money they are actually making from licensing this data. I doubt it is half a billion. Or 0.5% of GDP.

I am certain it is more akin to a hidden tax we all have to pay.


What’s the point?

I was looking at an article about the last world cup semifinal (here) and noticed something strange. I could view all videos embedded in the article except one. A blank screen in that little square had one of those stupid copyright notices you see these days: “This video is not authorized for your location (Canada)”



Well, how do they know my location? The IP address, of course. So I switched to an anonymous browser and watched the silly thing. (I wasn't even that curious to see it; I only went to the extra trouble because of this.) But still, what is the point?


Is the open internet becoming a regional internet, where you may only access certain content based on your physical real world location? I thought the whole idea of the decentralized internet had defeated that whole concept.

IP address location is pretty useful, and I've recently added it to my geocoding API, but this is absolutely one of the worst possible uses of the technology. It will simply drive more and more people to do what I just did, that is, hide their IP address.
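For readers curious how such a region lock might work, here is a minimal sketch. The lookup table is entirely hypothetical: it maps reserved documentation networks to made-up country codes, whereas a real service would consult a full IP-to-location database. The function names are mine, not those of any real product.

```python
import ipaddress

# Hypothetical lookup table: documentation-only networks (RFC 5737)
# mapped to made-up country codes, purely for illustration.
EXAMPLE_RANGES = [
    (ipaddress.ip_network("192.0.2.0/24"), "CA"),
    (ipaddress.ip_network("198.51.100.0/24"), "US"),
]


def country_for_ip(ip_text, ranges=EXAMPLE_RANGES):
    """Return the country code of the first range containing the address, or None."""
    address = ipaddress.ip_address(ip_text)
    for network, country in ranges:
        if address in network:
            return country
    return None


def is_blocked(ip_text, blocked_countries):
    """Mimic a region lock: block only when the address maps to a blocked country."""
    return country_for_ip(ip_text) in blocked_countries
```

Note that an address the table cannot place is not blocked in this sketch, which is exactly why hiding your IP address defeats the check.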


Time determines everything.

If you are smart or dumb (aka a retard), right or wrong (right on time), lucky or unlucky (wrong place, wrong time vs right time).

People complain that “time is short” which is really dumb since no one has ever come up with a measuring stick for time. They also use this word in other meaningless contexts like (“The time is now”, “time is up”, “down time”, “up time”, “a good time”, “a bad time”, “an ok time.”)

When someone complains that they “have no time”, they actually mean that they are not “fast enough”, i.e they are some sort of a retard considering the amount of work that is required of them in a set amount of time.

They do have time, the aforementioned set amount, but they are slow. They are even too slow to admit that, hence the justification of “I have no time.”

A justification requires less time to come by than the truth. That’s why so many of us, slow ones, opt for the first one. Idiots, retards!

An opinion takes even less time, sometimes. Some great things take almost no time at all. They happen in a split second. The rest is just boring repetition by slower people.

Everybody has lots of time. Even the ones that have been recently diagnosed with a fatal disease. By lots of time, I mean time to devote to the things that matter. If you are fast enough.

But, sometimes it is no fun to be fast. Sometimes the best use of your time is to let it go by, and think of something nice, anything but the passing of time.

Eventually everything will come into view.

Give it time.




Thanks for your time.

(Anonymous) Access Denied

I wrote a few weeks ago how you can save money by being anonymous. More specifically about purchasing a plane ticket with, which as one of the many websites in the industry, marks up their prices using the information they get about you from your regular browser.

Well, it seems like now, a couple of weeks later they have blocked access from anonymous browsers.

tor is denied









So the only remaining option is tunneling your connection via your own proxy. Otherwise, if you reach their website from an expensive laptop, or if any one of your personal information vendors tells them a bit about your buying habits, you are on the hook.

Be anonymous, save money

I was doing some research on plane tickets for a summer trip using several major booking companies (Expedia, Travelocity, etc)

In the course of this research I discovered a little trick that will be beneficial to all, except the aforementioned booking websites.

If you use a browser that makes you anonymous (for example Tor), or just browse via an SSH tunnel SOCKS proxy, you will get cheaper prices.

Making the exact same query within a one-minute timeframe from a regular browser and from an anonymized browser confirms this. (The regular browser results were always $100 or more above the ones from the anonymized browser.)
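If you want to keep track of such a comparison, a small helper like the one below will do. It is only a sketch: the itinerary labels and the dictionary layout are my own assumptions, and you would fill in the prices by hand from the two browser sessions.

```python
def price_gap(regular_quotes, anonymous_quotes):
    """Return {itinerary: regular price minus anonymous price} for itineraries quoted in both sessions."""
    gaps = {}
    for itinerary, regular_price in regular_quotes.items():
        if itinerary in anonymous_quotes:
            gaps[itinerary] = regular_price - anonymous_quotes[itinerary]
    return gaps


# Illustrative figures only; the itinerary labels are hypothetical.
regular = {"trip A": 1234, "trip B": 980}
anonymous = {"trip A": 1100}
gaps = price_gap(regular, anonymous)
```

A positive gap means the regular browser was quoted the higher price.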
Bottom line: if they know who you are, they will charge you more.

PS. It is also funny to note that upon opening the Expedia website from my SSH-tunneled browser, I got a pop-up window asking me to identify myself via a number of options in exchange for a $25 coupon. The difference in price between the two methods? $134! Thanks, but no thanks. Give it a try!

To tweet or not to tweet, that is the problem (or the solution)

The New Yorker ran a piece about Twitter bots, those little computer programs that never tire of tweeting on anything, from meaningless jabber to “Click this link” scams. I've never used Twitter much, nor written any lines of code to do so.

Until today.

I just wrote a little program that connects to the geocoder backend and tweets new postal codes as they are added to the system in real time.
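The gist of such a bot can be sketched in a few lines. This is not the actual program: the record layout, the message wording and the `post` callable (a stand-in for whatever Twitter client you use) are all assumptions made for illustration.

```python
TWEET_LIMIT = 140  # Twitter's character limit at the time of writing


def format_tweet(postal_code, latitude, longitude, limit=TWEET_LIMIT):
    """Build the message for one new postal code, cut to the character limit."""
    text = f"New postal code: {postal_code} ({latitude:.5f}, {longitude:.5f})"
    return text[:limit]


def announce_new_codes(records, seen, post):
    """Post each postal code that has not been announced yet.

    records: iterable of (postal_code, latitude, longitude) tuples
    seen:    set of postal codes already announced (updated in place)
    post:    callable that sends one message; hypothetical stand-in
             for a real Twitter client
    """
    posted = []
    for postal_code, latitude, longitude in records:
        if postal_code in seen:
            continue
        message = format_tweet(postal_code, latitude, longitude)
        post(message)
        seen.add(postal_code)
        posted.append(message)
    return posted
```

Passing `print` as `post` is a harmless way to try it out before wiring in a real client.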

It was about time I added my own meaningless jabber to this already boiling cauldron of laconic expression.

Enjoy! @geolytica

The unwitting foot soldiers of the crowdsourcing army

You may not know it, but you are contributing your daily activity to a major operation aimed at improving the world's data via crowdsourcing. Your daily movements, phone calls and web searches provide live feeds that improve today's most valuable asset: data.

You may be sitting at a restaurant with your phone in your pocket and your phone is reporting back to the mothership that you were at that location for almost 45 minutes, presumably having dinner, another data signal that indicates the restaurant at that location is open at that particular time.
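To make the idea concrete, here is a rough sketch of how a series of location pings could be turned into "this person stayed here" signals. Nothing here reflects how any phone vendor actually does it; the 50 metre radius, the 30 minute threshold and the ping format are assumptions chosen for illustration.

```python
import math


def distance_m(a, b):
    """Approximate distance in metres between two (lat, lon) points (equirectangular)."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    x = (lon2 - lon1) * math.cos((lat1 + lat2) / 2)
    y = lat2 - lat1
    return 6371000 * math.hypot(x, y)


def find_dwells(pings, radius_m=50, min_minutes=30):
    """Find stays of at least min_minutes within radius_m of where the stay began.

    pings: list of (minutes_since_start, lat, lon), sorted by time.
    Returns a list of (start_minute, end_minute, (lat, lon)).
    """
    dwells = []
    i = 0
    while i < len(pings):
        start_t, lat, lon = pings[i]
        j = i
        # Extend the stay while the next ping is still near the first one.
        while j + 1 < len(pings) and distance_m((lat, lon), pings[j + 1][1:]) <= radius_m:
            j += 1
        end_t = pings[j][0]
        if end_t - start_t >= min_minutes:
            dwells.append((start_t, end_t, (lat, lon)))
        i = j + 1
    return dwells
```

A 45-minute dwell at a restaurant's coordinates around dinner time is the kind of signal that suggests the place is open at that hour.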

A user of iOS 7 Beta 5 commented on Hacker News that you can see a history of all the places you have been to (if you had your phone with you).

A swipe of the finger reveals it (Settings → Privacy → Location Services → System Services → Frequent Locations). In case you were too drunk to remember, you know where you were last week! So does Apple. And who knows who else?

Where is Division No. 6 ??

It seems like most popular social websites are littered with bad location information. Case in point, we often see this on Twitter. I doubt any human reader would know where “Division No. 6” is. Apart from the census people who came up with it, I guess.

Twitter geotagging


there, here & somewhere else

Here are some notes from the Oracle location intelligence conference in Washington, DC.

People here are talking about fun things one can do with location data. It mostly boils down to a) where you have been, b) where you are and c) where you will be, and then using that information to make better decisions. The case studies range from the tech team that helped reelect President Obama to people who analyzed social media to help you find a better job nearby. Some case studies show how these tools can be used to advance the public good, others just how to improve marketing ROI. There are various buzzwords buzzing around: #geosocial, #geovalidation, #geomarketing, #geotagging, #geoanalytics... There are many companies, big and small, generating new business by figuring out new ways to do what we do, a little bit better.

One thing they all have in common is “geodata” which is sometimes also known as “big data.” Without the availability of geodata, none of these companies would have gotten off the ground in the first place. They are mostly US-based companies and they export their know-how and technology all over the world.


I was talking to a representative from a company offering “cloud geocoding services.” They had some interesting views on their coverage for Canada, which is only “partial.” The main reason, it seems, is that licensing of geodata in Canada is much more complex than in other countries (the US, according to him, is the best). In the view of my colleague, “Canada is the most difficult developed country when it comes to obtaining & licensing geodata.” The same view was shared by several others.

Canada’s location data are hard to get for small companies, but big companies still get what they need since they can pay the asking price. There is Oracle’s geocoding suite, Google Maps, Microsoft’s Mapinfo, ESRI and others that provide you with all the geocoding you need for Canada.

However, a global geocoding engine cannot fully replace a customized local solution. Take Google's geocoder, for instance. Google approaches the problem of geocoding like a search problem: it returns the highest-ranking location from its database. That could work great in certain countries but not all, because different countries write location information differently. A Brazilian colleague pointed out that they had to build their own geocoding engine, which is much more accurate than any of the global geocoders mentioned above.
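As a small example of what "local" knowledge looks like, here is a sketch of a normalizer for Canadian postal codes. It is not taken from any geocoder mentioned here. It relies on the published format (letter-digit-letter, digit-letter-digit, with D, F, I, O, Q and U unused and W and Z never in first position), and the function name is my own.

```python
import re

# Canadian format: letter-digit-letter digit-letter-digit.
# D, F, I, O, Q and U are not used; W and Z do not appear as the first letter.
_POSTAL_RE = re.compile(
    r"^[ABCEGHJ-NPRSTVXY]\d[ABCEGHJ-NPRSTV-Z]\d[ABCEGHJ-NPRSTV-Z]\d$"
)


def normalize_postal_code(text):
    """Return the code as 'A1A 1A1', or None if it cannot be a Canadian postal code."""
    # Drop spaces and hyphens, then compare in upper case.
    compact = re.sub(r"[\s-]+", "", text).upper()
    if _POSTAL_RE.match(compact) is None:
        return None
    return f"{compact[:3]} {compact[3:]}"
```

A generic search-style geocoder has no reason to know that "k1a0b1" and "K1A-0B1" are the same thing, or that "90210" can never be a Canadian code; a local one gets that for free.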

Case in point: can Google's geocoder accurately find my parent's house in Ottawa? As of this moment, no. can(1). According to one presenter, only 44% of companies (among those using the well-established players) are satisfied with the results of the geocoding process; 37% said it is too unreliable, 27% that it takes too many resources, and 14% that it is too slow.

This is not just an attempt at self promotion. It is just to state the fact that with even imperfect data one can build a geocoder that is of comparable accuracy to a geocoder utilizing the best geodata money can buy. If you focus on a small localized problem, chances are you might find a better solution than someone trying to solve all such problems with a single algorithm.

Location data are facts, but it costs money to collect those facts. Various companies collect this data in the course of their activities at a great expense (Google, UPS, FedEx, etc). The government spends a lot of taxpayer money for the same goal. But the government can never make that money back by selling the data at high prices. It is better to just give it away for free. The US government does this. It generates more innovation and more companies and more business, which in turn brings in more tax revenues.

Government is the only big data company in a position to release that data to the public at no cost, then sit back and earn revenue as a result.

I hope our Canadian government realizes this someday.

1) [Screen shot 2013-05-21 at 12.44.46 PM] [Screen shot 2013-05-21 at 12.44.36 PM]