The task here is to build a web application that scrapes various websites for data related to the Mission to Mars and displays the information in a single HTML page. The following outlines what I did.
Complete an initial scraping using Jupyter Notebook, BeautifulSoup, Pandas, and Requests/Splinter.
- I created a Jupyter Notebook file called `mission_to_mars.ipynb` and used it to complete all of the scraping and analysis tasks:
- I scraped the NASA Mars News Site and collected the latest News Title and Paragraph Text, then assigned the text to variables that I can reference later.
# Example:
news_title = "NASA's Next Mars Mission to Investigate Interior of Red Planet"
news_p = "Preparation of NASA's next spacecraft to Mars, InSight, has ramped up this summer, on course for launch next May from Vandenberg Air Force Base in central California -- the first interplanetary launch in history from America's West Coast."
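The news step can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the sample HTML and the class names `content_title` and `article_teaser_body` are assumptions made for the example, and the real page markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page source fetched with
# Requests or Splinter.
SAMPLE_HTML = """
<ul>
  <li class="slide">
    <div class="content_title">First headline</div>
    <div class="article_teaser_body">First teaser paragraph.</div>
  </li>
  <li class="slide">
    <div class="content_title">Second headline</div>
    <div class="article_teaser_body">Second teaser paragraph.</div>
  </li>
</ul>
"""


def parse_latest_news(html):
    """Return (news_title, news_p) for the first article found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    # The class names below are assumptions; adjust them to the real page.
    title_tag = soup.find("div", class_="content_title")
    teaser_tag = soup.find("div", class_="article_teaser_body")
    if title_tag is None or teaser_tag is None:
        raise ValueError("Could not find a news title and paragraph in the page")
    return title_tag.get_text(strip=True), teaser_tag.get_text(strip=True)


news_title, news_p = parse_latest_news(SAMPLE_HTML)
```

`soup.find` returns the first match in document order, so this picks up the latest article only if the page lists the newest item first.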
- I visited the URL for the JPL Featured Space Image.
- I used Splinter to navigate the site, find the image URL for the current Featured Mars Image, and assign the URL string to a variable called `featured_image_url`.
# Example:
featured_image_url = 'https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA16225_hires.jpg'
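A hedged sketch of the featured-image step is shown here. The function name and the CSS selector `img.featured-image` are hypothetical placeholders, and the base URL is taken from the example above. The `browser` argument is expected to behave like a Splinter browser (`visit`, `find_by_css`).

```python
from urllib.parse import urljoin

# Base URL taken from the example above; the selector is a hypothetical
# placeholder and must be replaced with one that matches the real page.
JPL_BASE_URL = "https://www.jpl.nasa.gov"
FEATURED_IMAGE_SELECTOR = "img.featured-image"


def find_featured_image_url(browser, page_url):
    """Visit page_url with a Splinter browser and return the absolute image URL."""
    browser.visit(page_url)
    image = browser.find_by_css(FEATURED_IMAGE_SELECTOR).first
    # The src attribute may be relative, so resolve it against the site root.
    # An src that is already absolute passes through urljoin unchanged.
    return urljoin(JPL_BASE_URL, image["src"])
```

With Splinter installed, a browser can be created with `from splinter import Browser` and `Browser("chrome")`, then passed in as `browser`.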
- I visited the Mars Facts webpage and used Pandas to scrape the table containing facts about the planet, including Diameter, Mass, etc.
- I used Pandas to convert the data to an HTML table string.
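The two Pandas steps might look something like this. It is a sketch under assumptions: the helper names are made up for illustration, taking the first table on the page is a guess, and the sample DataFrame uses placeholder values instead of scraped data.

```python
import pandas as pd


def read_first_table(url):
    """Read the first HTML table on a page.

    Needs network access and an HTML parser such as lxml. Taking index 0
    is an assumption; the facts table may sit at a different position.
    """
    # pd.read_html returns a list of DataFrames, one per <table> found.
    return pd.read_html(url)[0]


def facts_table_html(facts_df):
    """Convert a facts DataFrame into an HTML table string."""
    return facts_df.to_html(index=False)


# Illustrative stand-in for the scraped table; the values are placeholders.
sample_facts = pd.DataFrame(
    {"description": ["Diameter", "Mass"], "value": ["...", "..."]}
)
facts_html = facts_table_html(sample_facts)
```

Passing `index=False` leaves the DataFrame's numeric row index out of the rendered table.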
- I visited the USGS Astrogeology site to obtain high-resolution images for each of Mars's hemispheres.
- I set up Splinter to click each of the hemisphere links in order to find the URL of the full-resolution image.
- Then I saved both the image URL string for the full-resolution hemisphere image and the hemisphere title containing the hemisphere name. I created a Python dictionary to store the data using the keys `img_url` and `title`.
- Next, I appended the dictionary with the image URL string and the hemisphere title to a list. This list contains one dictionary for each hemisphere.
# Example:
hemisphere_image_urls = [
{"title": "Valles Marineris Hemisphere", "img_url": "..."},
{"title": "Cerberus Hemisphere", "img_url": "..."},
{"title": "Schiaparelli Hemisphere", "img_url": "..."},
{"title": "Syrtis Major Hemisphere", "img_url": "..."},
]
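The hemisphere loop could be sketched as below. All three CSS selectors and the function name are hypothetical, and the sketch visits each link's `href` directly instead of clicking it, which is a small variation on the approach described above. The `browser` argument is expected to behave like a Splinter browser.

```python
# Hypothetical CSS selectors; the real USGS page structure may differ.
HEMISPHERE_LINK_SELECTOR = "a.product-item"
TITLE_SELECTOR = "h2.title"
FULL_IMAGE_LINK_SELECTOR = "div.downloads a"


def collect_hemisphere_images(browser, index_url):
    """Return a list of {"title", "img_url"} dictionaries, one per hemisphere."""
    browser.visit(index_url)
    # Collect the detail-page URLs first, because element references can go
    # stale once the browser navigates away from the index page.
    page_urls = []
    for link in browser.find_by_css(HEMISPHERE_LINK_SELECTOR):
        href = link["href"]
        if href not in page_urls:  # skip duplicate links to the same page
            page_urls.append(href)

    hemisphere_image_urls = []
    for page_url in page_urls:
        browser.visit(page_url)
        title = browser.find_by_css(TITLE_SELECTOR).first.text
        img_url = browser.find_by_css(FULL_IMAGE_LINK_SELECTOR).first["href"]
        hemisphere_image_urls.append({"title": title, "img_url": img_url})
    return hemisphere_image_urls
```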
Use MongoDB with Flask templating to create a new HTML page that displays all of the information that was scraped from the URLs above.
- I started by converting the Jupyter notebook into a Python script called `scrape_mars.py` with a function called `scrape` that executes all of the scraping code from above and returns one Python dictionary containing all of the scraped data.
- Next, I created a route called `/scrape` that imports the `scrape_mars.py` script and calls its `scrape` function.
  - It stores the return value in Mongo as a Python dictionary.
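A minimal sketch of how `scrape` in `scrape_mars.py` might assemble its return value is shown below. The four helper functions are hypothetical stand-ins that return the example values from earlier in this README instead of contacting any website, and the key name `facts_html` is an assumption.

```python
# Hypothetical stand-ins for the notebook's scraping steps. Each one returns
# example values instead of contacting a website.
def scrape_news():
    return ("NASA's Next Mars Mission to Investigate Interior of Red Planet", "...")


def scrape_featured_image():
    return "https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA16225_hires.jpg"


def scrape_facts():
    return "<table>...</table>"


def scrape_hemispheres():
    return [
        {"title": "Valles Marineris Hemisphere", "img_url": "..."},
        {"title": "Cerberus Hemisphere", "img_url": "..."},
        {"title": "Schiaparelli Hemisphere", "img_url": "..."},
        {"title": "Syrtis Major Hemisphere", "img_url": "..."},
    ]


def scrape():
    """Run every scraping step and return all results in one dictionary."""
    news_title, news_p = scrape_news()
    return {
        "news_title": news_title,
        "news_p": news_p,
        "featured_image_url": scrape_featured_image(),
        "facts_html": scrape_facts(),  # key name is an assumption
        "hemisphere_image_urls": scrape_hemispheres(),
    }
```

Returning one flat dictionary means the result can be stored as a single Mongo document without reshaping it first.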
- I created a root route `/` that queries the Mongo database and passes the Mars data into an HTML template to display the data.
- I created a template HTML file called `index.html` that takes the Mars data dictionary and displays all of the data in the appropriate HTML elements.
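The two routes and the template can be sketched together as follows. This is an illustration under several assumptions, not the app's actual code: the `create_app` factory, the template markup, the `mars` variable name, and the `facts_html` key are all made up for the example. The template is inlined with `render_template_string` only to keep the sketch self-contained, whereas the project keeps it in `index.html`. Using `update_one` with `upsert=True` to keep a single document is one possible way to store the dictionary.

```python
from flask import Flask, redirect, render_template_string

# Pared-down stand-in for index.html; the real template is not reproduced here.
INDEX_TEMPLATE = """
<h1>Mission to Mars</h1>
{% if mars %}
<h2>{{ mars.news_title }}</h2>
<p>{{ mars.news_p }}</p>
<img src="{{ mars.featured_image_url }}" alt="Featured Mars image">
{{ mars.facts_html | safe }}
{% for hemisphere in mars.hemisphere_image_urls %}
<figure>
  <img src="{{ hemisphere.img_url }}" alt="{{ hemisphere.title }}">
  <figcaption>{{ hemisphere.title }}</figcaption>
</figure>
{% endfor %}
{% else %}
<p>No data yet. Visit /scrape first.</p>
{% endif %}
"""


def create_app(collection, scrape_function):
    """Build the Flask app around a Mongo-style collection and a scrape function."""
    app = Flask(__name__)

    @app.route("/scrape")
    def scrape_route():
        mars_data = scrape_function()
        # Keep a single document: update its fields, inserting it on first run.
        collection.update_one({}, {"$set": mars_data}, upsert=True)
        return redirect("/")

    @app.route("/")
    def index():
        # find_one returns None until /scrape has stored something.
        mars_data = collection.find_one()
        return render_template_string(INDEX_TEMPLATE, mars=mars_data)

    return app
```

In the real app, `collection` would be a PyMongo collection obtained from a `MongoClient`, and `scrape_function` would be `scrape_mars.scrape`. The `| safe` filter is needed because Flask autoescapes template output, which would otherwise display the facts table as literal HTML text.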
This is what the final product looks like:
Feel free to reach out to me if you have any questions.
E-mail: maercoli2017@gmail.com