top of page
National Parks Database  

CIS550: Database and Information Systems — Fall 2020

SQL, JavaScript  

Overview

The goal of this project was to create a website to display information based on overlapping datasets of interest and integrate these datasets into a SQL relational database. I worked in a team of four to accomplish this, and we had to perform tasks such as cleaning data, performing entity resolution (ER), designing schema, writing SQL queries, designing a web application, and optimizing SQL queries. 


We decided to create the website ParkPal, an interactive guide to US National Parks. In the era of COVID-19, it has become more difficult for people to socialize together in an indoor setting, and thus many are turning to outdoor activities. Therefore, we decided to create a website which provides a visualization of US National Parks geographically as well as park recommendations based on a variety of preferences and criteria.

Process

First, we found datasets from which we could create queries from which related to US National Parks. These datasets contained information on park trails and park biodiversity, which gave us data from park popularity and elevation gain to prevalence of different species of animals in parks. After that, we created a relational schema as an ER diagram to connect all the data in the datasets we had. 

ER Diagram from US National Parks

Next, we pre-processed the data in Python using Pandas. Some of the data cleaning included dropping columns we were not going to use, replacing null values with relevant ones, standardizing column names with regards to park names, abbreviating state names, and splitting some columns into multiple columns (such as splitting the geo-location column into separate latitude and longitude columns).

 

We then used Amazon AWS to host our database. We also used Oracle SQL Developer to create DDL statements and to write and optimize queries. 

 

To build the actual web page, we used the MERN stack: MongoDB, Express, React, and NodeJS. We also used libraries like bcrypt for hashing and comparing passwords and Leaflet / Leaflet-React to render the map on the website’s dashboard page. 

 

I was mainly responsible for writing SQL queries and optimizing them, as well as helping clean and process the data. 

Website and Query Overview 

Dashboard: This is the first page users see when they log in to the ParkPal webpage. It has a map with dynamically rendered markers which indicate the locations of national parks in the US. When a marker is clicked, the user can see the name and acreage of the park. 

 

Parks: this page has park recommendations for users. Users can input their location and how far they are willing to travel and they will be shown parks that are within the radius, ranked by closest proximity. Users can also input location, distance willing to travel, and activities they’d like to do at a national park; in return, they will get a list sorted by a score metric of such parks. We developed this score metric as a combination of popularity, difficulty rating, proximity, and presence of desired activities / features. Users can also include wildlife interests (i.e. if you are interested in seeing endangered bird species) and will be recommended parks which contain the most of that type of organism. 

 

Trails: this page has recommendations for hiking trails to users within a specific park. The user can input their current park and their level of hiking expertise to query trails to hike on. Experience levels are denoted as Beginner, Intermediate, and Advanced. We developed a trail difficulty metric which took into account user difficulty ratings, length of trail, and elevation gain of trail, and determined what difficulty level each trail would fall under using this metric. 

 

Wildlife: this page contains biodiversity information of different species living in each of the national parks. If a user wants to see a specific organism, they can find parks by querying on that organism’s common name. They can also input a park, the common name of a desired organism, and a phylogenetic category to be presented all organisms at that park which are closely related to the desired organism. 

 

Login: this page is for a user to log in with a username and corresponding password


Signup: this page is for a user to sign up for an account with ParkPal.

The Presentation

Here is the presentation for our Park Pal website. 

Final Report

This document includes the technical details behind the process used to create this website and the queries integrated into ParkPal. There is information about query runtimes and how we optimized them, as well as the technical challenges we faced, in this report. 

Park Pal Final Report

©2022 by saranya.

bottom of page