Exploratory Data Analysis of Zomato’s Restaurant Dataset | by Hargurjeet

Analyzing the Zomato API data is useful for foodies who want to taste the best cuisines from every part of the world within their budget. It also helps those looking for value-for-money restaurants in various parts of a country. Finally, it serves people who want to find the best cuisine of a country and the locality that serves that cuisine with the maximum number of restaurants.♨️

Dataset Details

  • Restaurant Id: Unique id of every restaurant across various cities of the world
  • Restaurant Name: Name of the restaurant
  • Country Code: Country in which restaurant is located
  • City: City in which restaurant is located
  • Address: Address of the restaurant
  • Locality: Location in the city
  • Locality Verbose: Detailed description of the locality
  • Longitude: Longitude coordinate of the restaurant’s location
  • Latitude: Latitude coordinate of the restaurant’s location
  • Cuisines: Cuisines offered by the restaurant
  • Average Cost for two: Cost for two people in different currencies 👫
  • Currency: Currency of the country
  • Has Table booking: yes/no
  • Has Online delivery: yes/no
  • Is delivering: yes/no
  • Switch to order menu: yes/no
  • Price range: range of price of food
  • Aggregate Rating: Average rating out of 5
  • Rating color: Color assigned according to the average rating
  • Rating text: Text label assigned on the basis of the rating
  • Votes: Number of ratings cast by people

The dataset is available on Kaggle. I built this project on Google Colab. The dataset can be downloaded as follows

Reading the data using pandas
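A minimal sketch of the loading step, assuming the data is a CSV file. The function name `load_zomato`, the file name in the usage comment, and the `latin-1` encoding are my assumptions, not details from the original notebook.

```python
import pandas as pd


def load_zomato(path_or_buffer, encoding="latin-1"):
    """Read the restaurant CSV into a DataFrame.

    The latin-1 default is an assumption; use "utf-8" if the file
    decodes cleanly with it.
    """
    return pd.read_csv(path_or_buffer, encoding=encoding)


# Usage (hypothetical file name):
# df = load_zomato("zomato.csv")
# print(df.shape)
```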

Checking if dataset contains any null
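One way to run this check is to count missing values per column. The sketch below uses a tiny invented frame in place of the real data, so the values are illustrative only.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the restaurant data; values are invented for illustration.
df = pd.DataFrame({
    "Restaurant Name": ["A", "B", "C"],
    "Cuisines": ["North Indian", np.nan, "Chinese"],
    "Votes": [10, 0, 25],
})

# Count missing values per column and keep only the columns that have any.
null_counts = df.isnull().sum()
columns_with_nulls = null_counts[null_counts > 0]
print(columns_with_nulls)
```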

Cuisines seems to contain null values. Any further analysis involving Cuisines therefore has to account for the NaN values.

Another file is available along with this dataset

Let us merge both datasets. This will help us to understand the dataset country-wise.
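The merge can be sketched as a join on the shared country code column. The two small frames below are invented stand-ins, and the column names `Country Code` and `Country` are assumed to match the real files.

```python
import pandas as pd

# Toy stand-ins for the two files; rows are invented for illustration.
restaurants = pd.DataFrame({
    "Restaurant Name": ["A", "B", "C"],
    "Country Code": [1, 1, 216],
})
countries = pd.DataFrame({
    "Country Code": [1, 216],
    "Country": ["India", "United States"],
})

# A left join keeps every restaurant, even if its code is missing
# from the lookup table.
merged = restaurants.merge(countries, on="Country Code", how="left")
print(merged)
```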

Before we ask questions of the dataset, it helps to understand the restaurants’ geographical spread, ratings, currency, online delivery, city coverage, etc.

List of countries the survey is spread across

The survey seems to be spread across 15 countries. This shows that Zomato is a multinational company with active business in all those countries.

As Zomato is a startup from India, it makes sense that most of its listed restaurants are in India.
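The country spread described above can be computed with `nunique` and `value_counts`. This is a sketch on invented rows, assuming the merged frame has a `Country` column.

```python
import pandas as pd

# Toy merged frame; rows are invented for illustration.
merged = pd.DataFrame({
    "Country": ["India", "India", "India", "United States", "UAE"],
})

# Number of distinct countries and restaurants listed per country.
n_countries = merged["Country"].nunique()
restaurants_per_country = merged["Country"].value_counts()
print(n_countries)
print(restaurants_per_country)
```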

The above information helps us understand the relation between aggregate rating, color, and text. We conclude that the following colors are assigned to the ratings:

  • Rating 0 — White — Not rated
  • Rating 1.8 to 2.4 — Red — Poor
  • Rating 2.5 to 3.4 — Orange — Average
  • Rating 3.5 to 3.9 — Yellow — Good
  • Rating 4.0 to 4.4 — Green — Very Good
  • Rating 4.5 to 4.9 — Dark Green — Excellent
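A mapping like the one above can be derived by grouping on the color and text columns and looking at the rating range inside each group. The sketch below uses invented rows and assumes the column names from the dataset description.

```python
import pandas as pd

# Toy data; rows are invented for illustration.
df = pd.DataFrame({
    "Aggregate Rating": [0.0, 2.1, 3.0, 3.7, 4.2, 4.6, 3.0],
    "Rating color": ["White", "Red", "Orange", "Yellow",
                     "Green", "Dark Green", "Orange"],
    "Rating text": ["Not rated", "Poor", "Average", "Good",
                    "Very Good", "Excellent", "Average"],
})

# For each (color, text) pair, find the rating range and the row count.
rating_bands = (
    df.groupby(["Rating color", "Rating text"])["Aggregate Rating"]
    .agg(["min", "max", "count"])
    .sort_values("min")
)
print(rating_bands)
```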

Let us try to understand the spread of rating across restaurants

Interestingly, the largest share of restaurants seems to be not rated. Let us check whether these restaurants belong to a specific country.

India seems to have the most unrated restaurants. In India, the culture of ordering food online is still gaining momentum, so many restaurants may still be unrated on Zomato because people prefer visiting the restaurant for a meal.
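The country breakdown of unrated restaurants can be sketched as a filter followed by a group count. The rows are invented, and the label `Not rated` is assumed to match the spelling used in the data.

```python
import pandas as pd

# Toy merged data; rows are invented for illustration.
df = pd.DataFrame({
    "Country": ["India", "India", "India", "Brazil", "United States"],
    "Rating text": ["Not rated", "Not rated", "Good",
                    "Not rated", "Excellent"],
})

# Keep only unrated restaurants, then count them per country.
unrated_by_country = (
    df[df["Rating text"] == "Not rated"]
    .groupby("Country")
    .size()
    .sort_values(ascending=False)
)
print(unrated_by_country)
```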

Country and Currency

The above table displays each country and the currency it accepts. Interestingly, four countries seem to accept dollars.
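A country-to-currency table of this kind can be built by dropping duplicate pairs. The rows and the currency labels below are invented for illustration and may not match the exact strings in the dataset.

```python
import pandas as pd

# Toy merged data; currency labels are assumptions.
merged = pd.DataFrame({
    "Country": ["India", "India", "United States", "Australia", "UAE"],
    "Currency": ["Indian Rupees(Rs.)", "Indian Rupees(Rs.)",
                 "Dollar($)", "Dollar($)", "Emirati Diram(AED)"],
})

# One row per distinct (country, currency) pair.
country_currency = (
    merged[["Country", "Currency"]].drop_duplicates().reset_index(drop=True)
)

# Countries whose currency label mentions dollars.
is_dollar = country_currency["Currency"].str.contains("Dollar", regex=False)
dollar_countries = country_currency.loc[is_dollar, "Country"].tolist()
print(country_currency)
print(dollar_countries)
```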

Online delivery distribution

Only 25% of restaurants accept online delivery. This figure might be biased, as most of the restaurants listed here are from India. A city-wise analysis might be more helpful.
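A percentage like this comes from a normalized value count. The sketch below uses four invented rows, and the `Yes`/`No` capitalization is an assumption about how the column is encoded.

```python
import pandas as pd

# Toy data; rows are invented for illustration.
df = pd.DataFrame({"Has Online delivery": ["Yes", "No", "No", "No"]})

# Share of each answer as a percentage of all restaurants.
delivery_share = df["Has Online delivery"].value_counts(normalize=True) * 100
print(delivery_share)
```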

Let us try to understand the coverage of the city

The data seems to be skewed towards New Delhi, Gurgaon, and Noida, with minimal data for other cities. Hence I will focus my analysis predominantly on New Delhi.

We’ve already gained several insights about the restaurants present in the survey. Let’s ask some specific questions and try to answer them using data frame operations and visualizations.

Connaught Place seems to have a high number of restaurants registered with Zomato. Let us understand the cuisines the top-rated restaurants have to offer.

Q2: What kind of cuisine do these highly rated restaurants offer?

Top-rated restaurants seem to be doing well in the following cuisines:

  • North Indian
  • Chinese
  • Italian
  • American
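Since one restaurant can list several cuisines in a single field, counting them requires splitting the field first. This is a sketch on invented rows, assuming cuisines are separated by a comma and a space; the NaN values noted earlier are dropped before splitting.

```python
import pandas as pd

# Toy data for top-rated restaurants; rows are invented for illustration.
df = pd.DataFrame({
    "Cuisines": ["North Indian, Chinese", "Italian", None, "North Indian"],
})

# Drop missing entries, split each string into a list of cuisines,
# put every cuisine on its own row, then count occurrences.
cuisine_counts = (
    df["Cuisines"].dropna().str.split(", ").explode().value_counts()
)
print(cuisine_counts)
```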

Q3: How many of these restaurants accept online delivery?

Apart from the Shahdara locality, restaurants in the other localities accept online delivery.

Online delivery seems to be on the higher side in Defence Colony and Malviya Nagar.
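A locality-by-delivery comparison like this can be tabulated with `pd.crosstab`. The rows below are invented and only mimic the pattern described above.

```python
import pandas as pd

# Toy data; rows are invented for illustration.
df = pd.DataFrame({
    "Locality": ["Defence Colony", "Defence Colony", "Malviya Nagar",
                 "Malviya Nagar", "Shahdara", "Shahdara"],
    "Has Online delivery": ["Yes", "Yes", "Yes", "No", "No", "No"],
})

# Rows are localities, columns are the Yes/No answers, cells are counts.
delivery_by_locality = pd.crosstab(df["Locality"], df["Has Online delivery"])
print(delivery_by_locality)
```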

Q4: Understanding restaurant ratings across localities

Apart from Malviya Nagar and Defence Colony, people in the rest of the localities seem to prefer visiting restaurants rather than ordering food online.

I would now like to understand the ratings of the restaurants that provide online delivery in Malviya Nagar and Defence Colony.

Defence Colony seems to have a high number of highly rated restaurants, but Malviya Nagar seems to have done better in terms of Good and Average restaurants.

In these localities, restaurants rated ‘Poor’ and ‘Not Rated’ are far fewer than those rated ‘Good’, ‘Very Good’ and ‘Excellent’. This may be why people there prefer online ordering.

Q5: Rating vs. cost of dining

I observe no linear relation between price and rating. For instance, restaurants with good ratings (4–5) appear in every price range and are spread across the entire X axis.
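The observation above comes from a plot. One way to put a number on it, which is my addition rather than a step from the original notebook, is the Pearson correlation coefficient. The toy values are invented so that the coefficient is zero even though cost and rating are clearly related in a non-linear way.

```python
import pandas as pd

# Toy data; values are invented for illustration.
df = pd.DataFrame({
    "Average Cost for two": [100, 200, 300, 400],
    "Aggregate Rating": [4.0, 3.0, 3.0, 4.0],
})

# Pearson correlation: values near 0 indicate no linear relationship,
# though a non-linear pattern may still exist.
r = df["Average Cost for two"].corr(df["Aggregate Rating"])
print(round(r, 3))
```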

Q6: Location of Highly rated restaurants across New Delhi

The aforementioned four cities represent nearly 65% of the total data available in the dataset. Apart from the highly rated local restaurants, it’d be interesting to know where the well-known, commonplace eateries are located. The verticals across which these can be located are:

  • Breakfast
  • American Fast Food
  • Ice Creams, Shakes & Desserts

1: Breakfast and Coffee locations

Chaayos outlets are doing better. We need more of those in Delhi. Café Coffee Day seems to be performing poorly on average rating and needs to improve its services.
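A chain comparison like this can be sketched by filtering on restaurant name and averaging the rating per chain. The chain names and ratings below are placeholders I made up; replace `chains` with the real outlet names to compare.

```python
import pandas as pd

# Toy data; names and ratings are invented placeholders.
df = pd.DataFrame({
    "Restaurant Name": ["Chain A", "Chain A", "Chain B",
                        "Chain B", "Local Diner"],
    "Aggregate Rating": [4.0, 3.0, 3.0, 2.0, 4.5],
})

chains = ["Chain A", "Chain B"]  # hypothetical list of chains to compare

# Average rating and outlet count per chain, best-rated first.
chain_ratings = (
    df[df["Restaurant Name"].isin(chains)]
    .groupby("Restaurant Name")["Aggregate Rating"]
    .agg(["mean", "count"])
    .sort_values("mean", ascending=False)
)
print(chain_ratings)
```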

2: Fast Food Restaurants

3: Ice Cream Parlors

Foreign brands seem to be doing better than the local brands

We’ve drawn many inferences from the survey. Here’s a summary of a few of them:

  • The dataset is skewed towards India and doesn’t represent the complete data of restaurants worldwide.
  • Restaurant ratings are categorized into six categories
  1. Not Rated
  2. Poor
  3. Average
  4. Good
  5. Very Good
  6. Excellent
  • Connaught Place has the maximum number of restaurants listed on Zomato, but in terms of online delivery acceptance Defence Colony and Malviya Nagar seem to be doing better.
  • The top-rated restaurants seem to be getting better ratings on the following cuisines
  1. North Indian
  2. Chinese
  3. American
  4. Italian
  • There is no linear relation between cost and rating. Some of the best-rated restaurants are low on cost and vice versa.
  • Among common eateries, Indian restaurants seem to be better rated for breakfast and coffee, but for fast food chains and ice cream parlors, American restaurants seem to be doing better.

Check out the following resources to learn more about the dataset and tools used in this notebook:

Thank You

The entire notebook can be accessed here. I would like to thank Aakash N S and the @Jovian Community for providing all the necessary training through the Zero to Pandas course.