Hw10 is ready for grading #9

Mathnstein · 2017-12-05T05:57:56Z

@vincenzocoia @gvdr @ksedivyhaley @JoeyBernhardt @mynamedaike @pgonzaleze @derekcho

hsmohammed · 2017-12-12T21:38:39Z

Hello,

Good job on homework 10. I liked how you extracted the rating data from the IMDB website and used it to explore the relationships between a movie's year and its rating, and between a movie's title length and its rating. Your analysis shows that good movies were produced in almost every year. Good job on the scraping and on using the gsub() function to clean your data. My only comment is that you put a PDF in your repo instead of a markdown file; in my opinion, the markdown format works better on GitHub. Overall, very good job, and I was happy to review your work.

Thank you,
Hossameldin Mohammed

emilymistick · 2017-12-14T02:03:35Z

Hi @Mathnstein,

Nice job on this homework!

I was able to download your .Rmd and reproduce the analysis.

Your scraping method is correct and concise: you extract the movie title, ranking, and year from the HTML of the URL of interest, save the data as a .csv, then load it back in for a small plotting analysis. The analysis is quite simple but sufficient for the assignment, and the results were interesting. It would also be nice to include a subset of the data table in the report, so the reader can see what the top few movies are straight from the .Rmd output without having to open the .csv.
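As a rough illustration of that suggestion, a report chunk could print the top few rows. This is only a sketch: the data frame `movies`, its column names, and its values are made-up placeholders, not the actual scraped data.

```r
# Placeholder data standing in for the scraped .csv (hypothetical values)
movies <- data.frame(
  title  = c("Movie A", "Movie B", "Movie C", "Movie D"),
  year   = c(1994L, 2008L, 2008L, 1972L),
  rating = c(9.2, 9.0, 8.5, 9.1),
  stringsAsFactors = FALSE
)

# Sort by rating, highest first, and keep the top three rows
top_movies <- head(movies[order(-movies$rating), ], 3)

# Printing the subset in an .Rmd chunk shows it directly in the report
print(top_movies)
```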

Overall nice work with web scraping! I've never used gsub() before and will try to remember that option in the future, looks quite useful.
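For reference, a minimal sketch of the kind of cleanup gsub() does. The input strings here are invented examples, not the values actually scraped in the homework.

```r
# Hypothetical raw year strings as they might come out of scraped HTML
raw_year <- c("(1994)", "(1972)", "(2008)")

# gsub(pattern, replacement, x) replaces every match of the pattern;
# the character class [()] matches either parenthesis
year <- as.integer(gsub("[()]", "", raw_year))

print(year)
```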

Thanks,
Emily

derekcho · 2017-12-21T21:59:27Z

Hi @Mathnstein, here are some comments about your hw10:

Task(s) selected: Scrape data
Data stored as file ready for downstream analysis: Yes
Basic Exploration: Yes
Reflection: Yes

It isn’t clear what the results of your scraping are. You should show an example of the clean data in a table in the report! However, it looks like the scraping of the web data was successful.
Interesting plots, but it seems like your conclusions could be incorrect without further exploration. Although there appear to be higher rated movies in recent years, there could also be more movies in recent years too! Not sure why length of a movie title would have any effect on its rating though.
Your assignment hits the required elements, however it feels like you could have dug a little deeper with the scraping. For example, perhaps find other datasets with other variables for these movies like box office earnings. The limited dataset really limits the amount of meaningful exploration that you can do.
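One way to check the point about recent years is to look at how many movies fall in each year alongside the ratings. The sketch below uses base R; the data frame `movies` and its values are hypothetical placeholders, not the scraped data.

```r
# Placeholder data standing in for the scraped .csv (hypothetical values)
movies <- data.frame(
  title  = c("Movie A", "Movie B", "Movie C", "Movie D"),
  year   = c(1994L, 2008L, 2008L, 1972L),
  rating = c(9.2, 9.0, 8.5, 9.1),
  stringsAsFactors = FALSE
)

# Number of movies per year: more entries in recent years could by itself
# explain seeing more highly rated movies there
movies_per_year <- table(movies$year)

# Mean rating per year, which does not grow just because a year has more movies
mean_rating_by_year <- aggregate(rating ~ year, data = movies, FUN = mean)

print(movies_per_year)
print(mean_rating_by_year)
```

Comparing the two outputs side by side helps separate "more good movies" from "more movies overall".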
Good work overall in STAT 545A and STAT 547M!

Your grade will be emailed to you at a later date.
