Data_Science_Interviews_NLP

Project Statement 

Recently I wondered whether there is a dataset containing a variety of data science interview questions & answers. I couldn't find one, so I decided to create my own! 🥳

So I spent several days gathering over 300 data science interview questions & answers, and finally built a dataset large enough to explore. To be honest, at the beginning I thought it would be really difficult to cover most types of data science questions, so I decided to gather only the non-coding ones.

Surprisingly, after reading hundreds of data science questions from websites such as

  • Simplilearn
  • Springboard
  • Towards Data Science
  • Edureka
  • Analytics Vidhya
  • Other GitHub repos

I found that there are not as many different interview questions as I expected. After careful data selection & categorization, I completed an NLP project on the dataset. Hopefully, you will enjoy reading it!

Exploration Results

[Figures, one per panel: EDA - Questions, EDA - Answers; Method - Questions, Method - Answers; Model - Questions, Model - Answers; Statistics - Questions, Statistics - Answers]

Chi_Square Test (Feature Selection)

[Figure: chi-square feature selection results]
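The README does not show how the chi-square scores were computed, so the following is only a minimal sketch of the general idea: for one category, each term is scored by the chi-square statistic of a 2x2 table (term present or absent vs. question in the category or not). The function name `chi_square_scores`, the whitespace tokenization, and the four toy questions are assumptions for illustration, not the project's data or code.

```python
from collections import Counter


def chi_square_scores(docs, labels, target):
    """Chi-square statistic per term for the 2x2 table:
    term present/absent vs. document in `target` category or not."""
    n = len(docs)
    token_sets = [set(doc.lower().split()) for doc in docs]  # assumed tokenizer
    in_target = [label == target for label in labels]
    n_target = sum(in_target)

    df_all = Counter()     # number of documents containing each term
    df_target = Counter()  # same, restricted to the target category
    for tokens, hit in zip(token_sets, in_target):
        df_all.update(tokens)
        if hit:
            df_target.update(tokens)

    scores = {}
    for term, present in df_all.items():
        a = df_target[term]      # term present, in target
        b = present - a          # term present, not in target
        c = n_target - a         # term absent, in target
        d = n - present - c      # term absent, not in target
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # A zero denominator means the table is degenerate (e.g. the term
        # occurs in every document), so the term carries no signal.
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


# Hypothetical toy questions, labelled with two of the README's categories.
docs = [
    "what is overfitting in a model",
    "how does a random forest model work",
    "what is a p value",
    "explain the central limit theorem",
]
labels = ["Model", "Model", "Statistics", "Statistics"]

scores = chi_square_scores(docs, labels, target="Model")
top_terms = sorted(scores, key=scores.get, reverse=True)[:3]
print(top_terms)
```

Terms that appear evenly across categories (such as "what" here) score zero, while terms concentrated in one category score highest, which is why the statistic is useful for picking category-specific keywords.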

Model Results

[Figures, one per model: MultinomialNB, Random Forest, Decision Trees]
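For readers unfamiliar with the first model listed, here is a rough, dependency-free sketch of a multinomial Naive Bayes text classifier with Laplace smoothing. It is not the project's code (the name MultinomialNB suggests a library implementation was used, though the README does not say); the helper names `train_multinomial_nb` and `predict`, the whitespace tokenization, and the toy questions are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit log priors and Laplace-smoothed log likelihoods per category."""
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        term_counts[label].update(doc.lower().split())  # assumed tokenizer

    vocab = {term for counts in term_counts.values() for term in counts}
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(term_counts[label].values()) + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(n_docs / len(docs)),
            "log_likelihood": {
                term: math.log((term_counts[label][term] + alpha) / total)
                for term in vocab
            },
        }
    return model


def predict(model, text):
    """Return the category with the highest posterior log score.
    Tokens outside the training vocabulary are ignored."""
    tokens = text.lower().split()

    def score(label):
        params = model[label]
        return params["log_prior"] + sum(
            params["log_likelihood"][t]
            for t in tokens
            if t in params["log_likelihood"]
        )

    return max(model, key=score)


# Hypothetical toy questions, labelled with two of the README's categories.
train_docs = [
    "what is overfitting in a model",
    "how does a random forest model work",
    "what is a p value",
    "explain the central limit theorem",
]
train_labels = ["Model", "Model", "Statistics", "Statistics"]

nb = train_multinomial_nb(train_docs, train_labels)
pred_model = predict(nb, "how does a decision tree model work")
pred_stats = predict(nb, "explain the p value")
print(pred_model, pred_stats)
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing term `alpha` keeps a term unseen in one category from zeroing out that category's score.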

Data Science Interview Questions Resource: