The code that generates the plots can be viewed on my GitHub. Enjoy!


It can be extremely challenging to make the correct decision in a world of infinite opinions. Opinions are great and necessary for sparking new ideas, but how do you determine which ideas to pursue and which to let go?

You look at the data. Airbnb's co-founder Brian Chesky once stated in an interview that when Airbnb began to scale, the company moved from an intuition-driven to a data-informed decision-making process. To come up with my project idea, I searched the web for datasets about Lyft and found one on Kaggle containing data from 307,408 Lyft rides in Boston during November 2018.

Looking at data to find ideas

My initial concern with using this dataset is that Boston Lyft rides in November 2018 might not be representative of Lyft rides across the US. Perhaps there are fewer rides in November because of the weather, or maybe Boston residents have different transportation habits than residents of other cities. However, I do think it can give us a rough idea of what Lyft rides look like, so I decided to use it.

In [8]:
from utils import display_cabRides
   distance  cab_type  time_stamp    destination    source            price  surge_multiplier  id                                    product_id    name
0  0.44      Lyft      1.544950e+12  North Station  Haymarket Square  5.0    1.0               424553bb-7174-41ea-aeb4-fe06d4f4b9d7  lyft_line     Shared
1  0.44      Lyft      1.543280e+12  North Station  Haymarket Square  11.0   1.0               4bd23055-6827-41c6-b23b-3c491f24e74d  lyft_premier  Lux
2  0.44      Lyft      1.543370e+12  North Station  Haymarket Square  7.0    1.0               981a3613-77af-4620-a42a-0c0866077d1e  lyft          Lyft
3  0.44      Lyft      1.543550e+12  North Station  Haymarket Square  26.0   1.0               c2d88af2-d278-4bfd-a8d0-29ca77cc5512  lyft_luxsuv   Lux Black XL
4  0.44      Lyft      1.543460e+12  North Station  Haymarket Square  9.0    1.0               e0126e1f-8ca9-4f2e-82b3-50505a09db9a  lyft_plus     Lyft XL

When looking through this dataset, two plots stood out: the distributions of ride distances and ride times. I was surprised to see that the average trip distance was only about 2.2 miles, with a standard deviation of about 1.1 miles. My intuition suggested that rides would be longer. The distribution of times is about what I was expecting to see. These results got me thinking.

In [9]:
from utils import plot_distributions
- - - - - - - - - - - - - - - - - - - - - - - - -
Mean distance:  2.186975582938635 (miles)
Std distance:  1.0866199074409777 (miles)
- - - - - - - - - - - - - - - - - - - - - - - - -
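
The summary statistics above could be reproduced with something like the sketch below. It is not the code behind plot_distributions (that lives in the author's utils module); the function names summarize_distances and rides_per_hour are hypothetical, the example inputs are made up for illustration, and the sketch assumes that time_stamp holds milliseconds since the Unix epoch, which is consistent with values around 1.54e+12 for late 2018.

```python
from collections import Counter
from datetime import datetime, timezone
from statistics import mean, pstdev


def summarize_distances(distances):
    """Return (mean, standard deviation) of ride distances in miles."""
    # pstdev is the population standard deviation; swap in statistics.stdev
    # for the sample version. Which one the original helper uses is unknown.
    return mean(distances), pstdev(distances)


def rides_per_hour(timestamps_ms, tz=timezone.utc):
    """Count rides by hour of day.

    Assumes each timestamp is milliseconds since the Unix epoch.
    """
    hours = (
        datetime.fromtimestamp(ts / 1000, tz=tz).hour for ts in timestamps_ms
    )
    return Counter(hours)


# Hypothetical inputs, for illustration only (not taken from the dataset).
example_distances = [0.5, 1.5, 2.5, 3.5]
example_timestamps = [0, 3_600_000, 3_600_001]

avg, std = summarize_distances(example_distances)
print(f"Mean distance: {avg} (miles)")
print(f"Std distance: {std} (miles)")
print(rides_per_hour(example_timestamps).most_common())
```

The hour counts default to UTC. To read peaks in local Boston time, a time zone such as zoneinfo.ZoneInfo("America/New_York") could be passed as tz.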

Opportunity Sizing

If Lyft is focused on selling rides in major cities, and most rides are about 2 miles long with ride times peaking at around 5 am and 7 pm, who is Lyft currently selling to? My guess: people going to work.

This led me to a new thought. How can Lyft tap into a new market and bring on new customers? I used the following questions to narrow my idea:

  • Why are Lyft rides so short?
  • Is there a market for longer duration Lyft rides?
  • Are there potential customer segments other than people commuting to work in large cities?

Based on the data above and my own experience using Lyft, I believe people are not using Lyft for long distance trips because there are cheaper alternatives. Why would someone pay hundreds of dollars for a Lyft ride from SF to LA if a flight is cheaper and quicker? But what if both the driver and the passenger benefited from arriving at the same destination? Could this be a way to make long distance rideshare more affordable?

Thinking through these questions led me to my product idea:

A carpool marketplace (app feature) for young people to 1) save money as a passenger and 2) make money as a driver on long distance road trips.

Market Research

To determine whether there was a market here (and who specifically the market is), I first went to Facebook. I searched for California university rideshare pages and found that there was already an entire online community of people who privately arrange carpooling on long distance road trips (e.g. San Francisco to LA for $30). Below is a bar plot that shows the largest university rideshare Facebook pages in California on the x-axis and their corresponding number of members on the y-axis. These pages are very active, and each averages more than 10 posts a day.

In [10]:
from utils import plot_facebook_rideshare

This plot suggests that the target market could be young adults.

I also searched the web to see whether anyone else was already doing this. I didn't find any major companies in the US, but I did come across BlaBlaCar, a French online marketplace for carpooling. It is present in 22 countries and has over 70 million users. It is focused on European countries and is not present in the United States, and according to one article it does not intend to expand there: "People in the United States who would like to use BlaBlaCar are out of luck. The company will not expand to the United States."

BlaBlaCar's popularity among young adults across Europe is further evidence that this idea could work in the United States.

On top of online research, I also spoke with my friends who are college students. Talking to people face-to-face gave me more ideas and I learned about more problems that people are currently having with long distance road travel. I integrated this feedback into the design of the product, which I will show later.

At the end of my market research, I decided on a target customer: young adults (16-23 yrs) who are primarily concerned with cost when it comes to traveling long distance.

Product Requirements & Features

  • Passenger Needs:

    • Affordable long distance travel
    • Safe & comfortable ride
    • Convenient pick-up/drop-off
  • Driver Needs:

    • Easy to post trips & get verified as a driver
    • Earn enough money to make it worth it
    • Convenient pick-up/drop-off

Product Requirements

Big picture: A marketplace on the Lyft app that connects drivers and passengers to carpool on long distance road trips.

The logistical requirements for each user are outlined below:

  • Driver must be able to:
    • Select a trip date
    • Enter a trip starting/end location & time
    • Enter the number of available seats they would like to sell
    • Choose a price they would like to charge for each seat
    • Write a trip description
  • Passenger must be able to:
    • Select trip date
    • Enter starting/end location
    • Read trip description written by the driver
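
The requirements above can be sketched as a small data model. This is only an illustration under stated assumptions, not a design from Lyft or BlaBlaCar: the Trip class, its field names, and the search_trips helper are all hypothetical, and sorting results cheapest first is my own choice, made because the target customer is primarily concerned with cost.

```python
from dataclasses import dataclass
from datetime import date, time


@dataclass
class Trip:
    """A trip posted by a driver (hypothetical schema)."""
    trip_date: date
    origin: str
    destination: str
    departure_time: time
    seats_available: int
    price_per_seat: float
    description: str


def search_trips(trips, trip_date, origin, destination):
    """Return trips matching the passenger's date and route, cheapest first."""
    matches = [
        t for t in trips
        if t.trip_date == trip_date
        and t.origin.lower() == origin.lower()
        and t.destination.lower() == destination.lower()
        and t.seats_available > 0  # hide trips that are already full
    ]
    return sorted(matches, key=lambda t: t.price_per_seat)


# Example listing, using the San Francisco to LA for $30 price point
# mentioned in the market research; all other values are made up.
posted = [
    Trip(date(2019, 3, 1), "San Francisco", "Los Angeles", time(9, 0),
         3, 30.0, "Leaving from campus, room for one bag each."),
]
for trip in search_trips(posted, date(2019, 3, 1), "san francisco", "los angeles"):
    print(trip.departure_time, trip.price_per_seat, trip.description)
```

Each driver requirement maps to one field, and the passenger's three requirements (date, route, description) map to the search arguments and the returned Trip objects.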


First, I went on the Lyft app to see how this feature could fit in with the existing environment. Then, I walked through BlaBlaCar's app and noted features that I liked and disliked. After this initial research, I began designing my product.

The MVP should have two different layouts: one for the driver, and one for the passenger. Both layouts are visually outlined below as wireframes.

In [11]:
from IPython.display import Image