Finding Hidden City Flights using Neo4j & Python 🐍 Part #3


If you’ve played around with the Kiwi flight search for a bit, you’ll probably have noticed that it is able to find some pretty darn cheap airfares, like a transcontinental flight for a bit more than $100 or an international flight within Europe for 10 €. In this post, I’ll introduce you to hidden city ticketing, which can save you a lot of money, especially when no other cheap fares are available. Of course, we’ll do it the Python and Neo4j way!

This post is part of a series. I highly recommend checking out the previous posts in case you haven’t been following along from the start.

Disclaimer: Hidden city ticketing is a great way to save money, but it also violates the terms of service of some airlines. You, as a passenger and customer, are solely responsible for your actions. It’s not the author’s intent to support you in doing anything fraudulent. Instead, this post should be read as a proof of concept of how Neo4j and Python can be combined to find low airline fares.

Great, if you made it past the disclaimer, let’s get started with the fun part.

What is hidden city ticketing?

Let’s assume you wanna fly from Greensboro, NC (GSO) to New York Newark, NJ (EWR). It would cost you 179 € for a one-way flight. Not bad, but we can get there cheaper…

If you search for a flight that connects in Newark but continues somewhere else, the price for the same flight drops to 133 €. Instead of booking a flight to Newark, you would book one whose final destination is Providence (PVD) and get off the plane in Newark.

Note: Hidden city ticketing only works for one-way flights, as the remaining segments of a ticket become invalid once you skip one of them.

How to store the flight data in Neo4j?

Before we can proceed, I’d like to talk about how I store the flights in the Neo4j database.

I call an insert function for every segment of a flight that I search. It creates the relationships between the date, the origin and destination airports, and the flight, which is also saved as a node. I also pass along a bunch of properties like distance and price.
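
A minimal sketch of such an insert function might look like the following. The node labels (Airport, Day, Month, Flight), the relationship types, the property names and the helper names `segment_params` and `insert_segment` are assumptions made for this illustration, and the flight number and distance in the usage example are placeholders.

```python
from datetime import date

# Cypher statement for one flight segment. All labels, relationship types
# and property names are assumptions made for this sketch.
INSERT_SEGMENT = """
MERGE (o:Airport {code: $origin})
MERGE (d:Airport {code: $destination})
MERGE (m:Month {month: $month})
MERGE (day:Day {date: $date})
MERGE (day)-[:IN_MONTH]->(m)
MERGE (f:Flight {flight_no: $flight_no, date: $date})
SET f.price = $price, f.distance = $distance
MERGE (o)-[:DEPARTS]->(f)
MERGE (f)-[:ARRIVES]->(d)
MERGE (f)-[:ON_DAY]->(day)
"""


def segment_params(origin, destination, flight_no, day, price, distance):
    """Build the parameter dict for INSERT_SEGMENT from one segment."""
    return {
        "origin": origin,
        "destination": destination,
        "flight_no": flight_no,
        "date": day.isoformat(),         # ISO date string for the Day node
        "month": day.strftime("%Y-%m"),  # year and month for the Month node
        "price": price,
        "distance": distance,
    }


def insert_segment(session, **segment):
    """Run the statement on an open session of the official neo4j driver."""
    session.run(INSERT_SEGMENT, segment_params(**segment))


# Placeholder flight number and distance, price taken from the example above.
params = segment_params("GSO", "EWR", "XX123", date(2018, 6, 6), 179.0, 0)
print(params["date"], params["month"])
```

Passing the values as query parameters instead of formatting them into the string keeps the statement reusable for every segment.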

I use a simple loop over the segments of the flights to insert them into the database.
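
Such a loop could be sketched roughly as follows. The shape of the search result (a list of itineraries, each with a `segments` list of dicts) and the `insert_segment` callable are hypothetical; the real search response will look different and would need to be mapped first.

```python
def insert_itineraries(itineraries, insert_segment):
    """Insert every segment of every itinerary; return how many were inserted.

    `insert_segment` is any callable that accepts the segment's keys as
    keyword arguments (a hypothetical interface for this sketch).
    """
    inserted = 0
    for itinerary in itineraries:
        for segment in itinerary["segments"]:
            insert_segment(**segment)
            inserted += 1
    return inserted


# Usage with a stand-in that only collects the segments instead of
# writing to Neo4j. Flight numbers are placeholders.
collected = []
itineraries = [
    {"segments": [
        {"origin": "GSO", "destination": "EWR", "flight_no": "XX100"},
        {"origin": "EWR", "destination": "PVD", "flight_no": "XX200"},
    ]},
]
count = insert_itineraries(itineraries, lambda **seg: collected.append(seg))
print(count, [seg["destination"] for seg in collected])
```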

The MERGE clause is really useful in this case, as it only creates the nodes if they are not already stored in the database. That way I can avoid having hundreds of nodes for December.

The image below shows what the origin and destination airport, day, month and flight nodes look like once they’ve been inserted into Neo4j. It shows a number of possible connections from GSO to EWR on 06/06/2018. Prior to creating this image, I searched and inserted GSO->EWR and GSO->PVD.

[Image: HiddenCity_Neo4j]

The Cypher query for finding these nonstop connections is fairly simple.
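
As a sketch, and assuming a schema where Airport nodes are linked to Flight nodes via DEPARTS and ARRIVES relationships and flights are linked to a Day node via ON_DAY (all of these names are assumptions for illustration), such a query could be run from Python like this. `find_nonstops` is a hypothetical helper; `session` is expected to be an open session of the official neo4j driver.

```python
# Labels, relationship types and property names are assumed for this sketch.
NONSTOP_QUERY = """
MATCH (o:Airport {code: $origin})-[:DEPARTS]->(f:Flight)
      -[:ARRIVES]->(d:Airport {code: $destination}),
      (f)-[:ON_DAY]->(:Day {date: $date})
RETURN f.flight_no AS flight_no, f.price AS price
ORDER BY price
"""


def find_nonstops(session, origin, destination, day):
    """Return all stored flights between two airports on one day.

    `day` is an ISO date string such as "2018-06-06".
    """
    result = session.run(
        NONSTOP_QUERY, origin=origin, destination=destination, date=day
    )
    # Each record is converted into a plain dict for easier handling.
    return [record.data() for record in result]
```

With a driver session at hand, `find_nonstops(session, "GSO", "EWR", "2018-06-06")` would return one dict per flight node, cheapest first.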

You may be wondering what these connections with multiple flight numbers are. These are additional flights (“hidden flights”) that I insert whenever a search result consists of multiple flights. I basically join the flight numbers and also save the final destination as a property. That way I have many more combinations in my database, even if I haven’t proactively searched for them, and I’ll be able to retrieve them using my standard query. Note that we can’t just insert a single relationship between the airports of the segments, as the flights have to be taken in the specific order as ticketed.
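
To make the idea more concrete, here is a rough sketch of how such hidden flight entries could be derived from a multi-segment itinerary. The function name, the dict keys and the `/` separator are assumptions for illustration, and the flight numbers are placeholders. The sketch creates one entry per intermediate stop, which is one possible reading of the approach described above.

```python
def hidden_city_records(segments, ticket_price):
    """Derive hidden flight entries from the segments of one ticket.

    Each entry runs from the ticket's origin to an intermediate stop,
    carries the joined flight numbers of the whole ticket and remembers
    the ticketed final destination.
    """
    if len(segments) < 2:
        return []  # a nonstop ticket has no hidden city
    joined = "/".join(seg["flight_no"] for seg in segments)
    origin = segments[0]["origin"]
    final_destination = segments[-1]["destination"]
    return [
        {
            "origin": origin,
            "destination": seg["destination"],  # where you would get off
            "flight_no": joined,
            "final_destination": final_destination,
            "price": ticket_price,
        }
        for seg in segments[:-1]
    ]


ticket = [
    {"origin": "GSO", "destination": "EWR", "flight_no": "XX100"},
    {"origin": "EWR", "destination": "PVD", "flight_no": "XX200"},
]
for record in hidden_city_records(ticket, 133):
    print(record)
```

For the GSO to PVD ticket this yields a single GSO to EWR entry priced at the full ticket price of 133 €, which is exactly the fare you pay when getting off in Newark.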

How to find them the pythonic way?

You may now be asking how I knew that PVD is a possible destination that drops the price of our flight and still connects in EWR. First, I looked up the direct connections departing from EWR in the openflights.org file and then inserted those into the database.
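
A minimal sketch of that lookup, assuming the OpenFlights routes file (a headerless CSV whose columns are airline, airline ID, source airport, source airport ID, destination airport, destination airport ID, codeshare, stops, equipment). The function name is hypothetical and the sample rows below are invented for illustration; only the parsing step is shown, not the insert into Neo4j.

```python
import csv
import io


def direct_destinations(routes_file, origin):
    """Return the sorted airport codes reachable nonstop from `origin`."""
    destinations = set()
    for row in csv.reader(routes_file):
        if len(row) < 8:
            continue  # skip malformed or empty lines
        source, destination, stops = row[2], row[4], row[7]
        if source == origin and stops == "0":
            destinations.add(destination)
    return sorted(destinations)


# Invented sample rows in the routes file layout (not real data).
sample = io.StringIO(
    "XX,1,EWR,10,PVD,20,,0,ERJ\n"
    "XX,1,EWR,10,BOS,30,,0,738\n"
    "XX,1,GSO,40,EWR,10,,0,ERJ\n"
    "XX,1,EWR,10,PVD,20,Y,0,ERJ\n"
)
print(direct_destinations(sample, "EWR"))  # ['BOS', 'PVD']
```

With a local copy of the file, the same function can be called as `direct_destinations(open("routes.dat", newline="", encoding="utf-8"), "EWR")`, where the path is an assumption about where the download was saved.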

In the next step, I basically search for flights using each airport in this list as my destination. We could swap the search_flights function for search_flights_multi, but I deliberately kept it, as I get more connections (even if they might not connect in EWR), which might come in handy later on.
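
In code, this step boils down to a loop over the candidate airports. The sketch below passes the search function in as an argument; its signature `(origin, destination, travel_date)` is an assumption for illustration and may not match the actual search_flights function, and `search_all` is a hypothetical helper name.

```python
def search_all(origin, candidates, travel_date, search_flights):
    """Search flights from `origin` to every candidate destination.

    Returns a dict that maps each destination code to whatever the
    supplied search function returned for it.
    """
    results = {}
    for destination in candidates:
        if destination == origin:
            continue  # searching a flight to the origin itself is pointless
        results[destination] = search_flights(origin, destination, travel_date)
    return results


# Usage with a stand-in search function that only echoes its input.
def fake_search(origin, destination, travel_date):
    return f"{origin}->{destination} on {travel_date}"


print(search_all("GSO", ["PVD", "BOS", "GSO"], "2018-06-06", fake_search))
```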

The last step is to query the database for the cheapest connection, whether it’s a hidden city ticket, a connection or just a direct flight. We can find the cheapest flight with a Cypher query.
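
A sketch of what such a query could look like when run from Python is shown below. As before, the labels, relationship types and property names (including `final_destination` for hidden city entries) are assumptions, and `cheapest_flight` is a hypothetical helper. The query only considers flight nodes that link the two airports directly, which under the scheme described above also covers the joined hidden city entries; it does not chain separate flights.

```python
# Labels, relationship types and property names are assumed for this sketch.
CHEAPEST_QUERY = """
MATCH (:Airport {code: $origin})-[:DEPARTS]->(f:Flight)
      -[:ARRIVES]->(:Airport {code: $destination})
RETURN f.flight_no AS flight_no,
       f.price AS price,
       f.final_destination AS final_destination
ORDER BY f.price ASC
LIMIT 1
"""


def cheapest_flight(session, origin, destination):
    """Return the cheapest stored flight as a dict, or None if none exists."""
    result = session.run(CHEAPEST_QUERY, origin=origin, destination=destination)
    record = result.single()  # LIMIT 1 guarantees at most one record
    return record.data() if record is not None else None
```

If the returned `final_destination` is set and differs from the requested destination, the cheapest option is a hidden city ticket.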

tl;dr

This quick rundown demonstrates that hidden city flights can be searched efficiently using Python and Neo4j. By using the information from openflights.org, we’ve been able to find some great airfares. There is still a lot of room for improvement, but I’ll cover that in another post. The code for this post can be found on GitHub Gist.

Cheers Alex

PS: Thanks also to Philippe Khin for his blog posts and code snippets about a different project, which had a lot of overlap with my own. These came in handy!

 
