Homework 6
Assignment#
Answer the following questions. Show the code for every step to receive credit; an answer alone is not enough.
Create a .ipynb file and write your answers in it. Clearly mark question numbers using markdown in your file.
Name your file HW5_<LASTNAME>.ipynb, e.g., HW5_CHEN.ipynb.
Submit a presentation (< 5 minutes) as well, e.g., HW5_CHEN.mp4.
1. numpy
Questions#
Q. 1-1#
Describe your answer to the question: “what is NumPy?” in a short paragraph (2-4 sentences).
Q. 1-2#
Import numpy in your script (recall the convention).
Q. 1-3#
Using numpy functions exclusively, create a numpy.ndarray that contains the even numbers between 0 and 9.
Q. 1-4#
Using numpy functions exclusively, create a sequence of numbers from 0 to 50 with a constant increment of 0.5 (i.e., 0, 0.5, 1, 1.5, …, 49.5, 50).
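As a hedged illustration of how endpoints behave when building evenly spaced sequences, the sketch below uses a different, made-up range (0 to 2 in steps of 0.25). The variable names are hypothetical; np.arange and np.linspace are the only library calls used.

```python
import numpy as np

# Hypothetical range chosen only for illustration: 0 to 2 in steps of 0.25.
# np.arange excludes the stop value, so the stop is pushed one step past the end.
steps_arange = np.arange(0, 2 + 0.25, 0.25)

# np.linspace includes both endpoints; num is the number of points, not the step.
steps_linspace = np.linspace(0, 2, num=9)

print(steps_arange)
print(np.allclose(steps_arange, steps_linspace))
```

With a step that is not exactly representable in binary floating point, np.linspace is usually the safer way to guarantee that the last value is included.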
Q. 1-5#
Create a 4x4 2-D array of random numbers drawn from a uniform distribution over the range [0, 1).
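A minimal sketch of uniform sampling on the half-open interval [0, 1), shown on a smaller 2x3 shape. The seed value and variable names are assumptions made for reproducibility of the illustration only.

```python
import numpy as np

# Seed 0 is an arbitrary choice so the sketch is repeatable.
rng = np.random.default_rng(0)

# Generator.random draws floats uniformly from [0, 1): 0 is possible, 1 is not.
sample = rng.random((2, 3))

print(sample.shape)
print(sample.min() >= 0.0, sample.max() < 1.0)
```

The older np.random.rand(2, 3) call draws from the same distribution; the Generator interface is the one NumPy currently recommends for new code.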
Q. 1-6#
Import arcpy and set the workspace to the class geodatabase, i.e., class_data.gdb.
Create a variable that references the crash feature class.
Q. 1-7#
Retrieve and print all the field names (as a list) in the crash feature class.
Convert the crash feature class to a (structured) NumPy array with the following fields.
crash_fields = ['City', 'Crash_Type', 'Vehicles', 'Non_Motorists', 'Fatalities', 'Injuries', 'Alcohol_Related', 'Distraction_Related', 'Drug_Related', 'Estimated_Damages', 'Weather_Condition', 'Light_Condition', 'Crash_Severity', 'Manner_of_Collision', 'Type_of_Intersection', 'Passengers', 'Bicyclists', 'Pedestrians', 'Citations', 'Property_Dmg_Amt','Vehicle_Dmg_Amt']
Store the structured array in a variable named crash_arr.
Find out and print how many records (rows) are in crash_arr.
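For readers unfamiliar with structured arrays, the sketch below builds a tiny one by hand rather than from a feature class. The three records, the city names, and the variable names are made up for illustration; only the field names are borrowed from the list above.

```python
import numpy as np

# Each field has a name and its own dtype, much like the columns of a table.
toy_dtype = [("City", "U20"), ("Injuries", "i4"), ("Estimated_Damages", "f8")]

# Made-up records; each tuple is one row.
toy_arr = np.array(
    [("Alpha", 0, 1500.0), ("Beta", 2, 8200.0), ("Alpha", 1, 4300.0)],
    dtype=toy_dtype,
)

print(toy_arr.dtype.names)  # field names as a tuple
print(toy_arr.shape[0])     # number of records (rows)
```

A structured array is one-dimensional: its length is the number of records, and the fields live in the dtype rather than in a second axis.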
Q. 1-8#
Retrieve the first 7 elements of crash_arr.
Grab (retrieve and store as a variable) the 1st, 4th, 8th, 12th, 50th, and 100th rows of the array.
Grab the Fatalities column from the array.
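The three kinds of access asked for here (a slice, a list of row positions, and a field name) can be sketched on a made-up ten-row structured array. The row_id field and all values are hypothetical.

```python
import numpy as np

# Ten made-up records: row_id is 0..9 and Fatalities cycles through 0, 1, 2.
toy_arr = np.array(
    [(i, i % 3) for i in range(10)],
    dtype=[("row_id", "i4"), ("Fatalities", "i4")],
)

first_three = toy_arr[:3]         # slice: the first three records
picked = toy_arr[[0, 3, 7]]       # list of positions: 1st, 4th, and 8th records
fatal_col = toy_arr["Fatalities"] # field name: one column as a plain ndarray

print(first_three)
print(picked["row_id"])
print(fatal_col)
```

Positions are zero-based, so the n-th row lives at index n - 1.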
Q. 1-9#
Write code to answer the following questions:
How many crashes have positive fatalities in this dataset?
What is the maximum value of fatalities in the data?
How many total fatalities are in the dataset?
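Questions of this kind usually reduce to a boolean mask plus a reduction. A minimal sketch on a made-up array of counts (the values and variable names are hypothetical):

```python
import numpy as np

# Made-up per-record counts, standing in for a numeric column.
counts = np.array([0, 1, 0, 3, 0, 2])

# Comparing an array to a scalar yields a boolean mask of the same shape.
is_positive = counts > 0

n_positive = np.count_nonzero(is_positive)  # how many records satisfy the condition
largest = counts.max()                      # largest single value
total = counts.sum()                        # sum over all records

print(n_positive, largest, total)  # 3 3 6
```

Because True counts as 1, is_positive.sum() gives the same count as np.count_nonzero.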
Q. 1-10#
The total damage of an accident is defined as property damage plus vehicle damage. Create a new array that adds the two corresponding columns.
What is the average value (mean) of total damage per crash? What about the standard deviation?
Find out which accident has the maximum total damage in this dataset: what was the weather condition, and in which city did it occur?
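The pattern here (elementwise addition of two fields, summary statistics, then looking up the record at the maximum) can be sketched on three made-up records. The city names, weather values, and amounts are invented; only the field names follow the list given earlier.

```python
import numpy as np

# Three made-up records with the four fields this pattern needs.
toy_arr = np.array(
    [
        ("Alpha", "Clear", 1000.0, 500.0),
        ("Beta", "Rain", 200.0, 4000.0),
        ("Alpha", "Snow", 0.0, 1500.0),
    ],
    dtype=[
        ("City", "U10"),
        ("Weather_Condition", "U10"),
        ("Property_Dmg_Amt", "f8"),
        ("Vehicle_Dmg_Amt", "f8"),
    ],
)

# Fields are plain ndarrays, so they add elementwise.
total_dmg = toy_arr["Property_Dmg_Amt"] + toy_arr["Vehicle_Dmg_Amt"]

mean_dmg = total_dmg.mean()
std_dmg = total_dmg.std()  # population standard deviation (ddof=0) by default

# argmax returns the position of the maximum, which indexes the original records.
worst_idx = total_dmg.argmax()
worst_row = toy_arr[worst_idx]

print(mean_dmg, std_dmg)
print(worst_row["City"], worst_row["Weather_Condition"])
```

Passing ddof=1 to std gives the sample standard deviation instead; which one is wanted depends on how the data are interpreted.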
2. pandas
Questions#
Q. 2-1#
Describe your answer to the question: “what is Pandas?” in a short paragraph (2-4 sentences).
Q. 2-2#
Import both numpy and pandas (recall the convention).
Q. 2-3#
Convert the total damage from a numpy.ndarray to a pandas.Series.
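A minimal sketch of wrapping a one-dimensional array in a Series, using made-up values. The name total_damage is an arbitrary label chosen for the illustration.

```python
import numpy as np
import pandas as pd

# Made-up values standing in for a 1-D numeric array.
values = np.array([1500.0, 4200.0, 1500.0])

# The Series keeps the array's dtype and adds a default integer index (0, 1, 2, ...).
damage_series = pd.Series(values, name="total_damage")

print(damage_series)
```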
Q. 2-4#
Convert the crash_arr array to a DataFrame and name it crash_df.
Preview the first 5 rows.
Preview the last 8 rows.
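pandas can build a DataFrame directly from a structured array, using the field names as column labels. The sketch below shows this on a made-up four-record array; the records and variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up structured array with two fields.
toy_arr = np.array(
    [("Alpha", 0), ("Beta", 2), ("Alpha", 1), ("Gamma", 0)],
    dtype=[("City", "U10"), ("Injuries", "i4")],
)

# Field names become column labels; each record becomes a row.
toy_df = pd.DataFrame(toy_arr)

print(toy_df.head(2))  # first 2 rows
print(toy_df.tail(1))  # last row
```

Both head and tail default to 5 rows when called without an argument.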
Q. 2-5#
Retrieve columns ['Alcohol_Related', 'Drug_Related', 'Injuries'] from crash_df
using label indexing.
using position-based indexing.
Assign these columns to a new dataframe variable (call it a name of your choice).
How many crashes are alcohol related? How about drug related?
Driving Under the Influence (DUI) is a criminal (not civil) charge (up to 6 months in jail for a first offense). Exceeding a certain limit of either drugs or alcohol is considered DUI. In this dataset, what is the percentage of DUI-related crashes?
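Label-based and position-based selection, and an "either condition" percentage, can be sketched as follows. The frame is made up, and coding the two flag columns as 0/1 integers is an assumption; the real columns may be coded differently (for example as text), in which case the comparison value would change.

```python
import pandas as pd

# Made-up frame; the 0/1 coding of the flag columns is an assumption.
toy_df = pd.DataFrame(
    {
        "City": ["Alpha", "Beta", "Alpha", "Gamma"],
        "Alcohol_Related": [1, 0, 0, 1],
        "Drug_Related": [0, 0, 1, 1],
        "Injuries": [1, 0, 2, 0],
    }
)

wanted = ["Alcohol_Related", "Drug_Related", "Injuries"]
by_label = toy_df.loc[:, wanted]          # .loc selects by column label
by_position = toy_df.iloc[:, [1, 2, 3]]   # .iloc selects by zero-based position

# The | operator combines two boolean Series elementwise (logical OR).
either = (toy_df["Alcohol_Related"] == 1) | (toy_df["Drug_Related"] == 1)

# The mean of a boolean Series is the fraction of True values.
pct_either = either.mean() * 100

print(by_label.equals(by_position))
print(pct_either)  # 75.0
```

Using OR rather than adding the two counts avoids double-counting rows where both flags are set.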
Q. 2-6#
Aggregate crash_df by “City” and then calculate the sum.
Which city has the largest number of citations?
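A hedged sketch of a group-and-sum followed by a lookup of the largest group, on a made-up frame. The city names and counts are invented for illustration.

```python
import pandas as pd

# Made-up frame with one grouping column and one numeric column.
toy_df = pd.DataFrame(
    {
        "City": ["Alpha", "Beta", "Alpha", "Gamma"],
        "Citations": [1, 4, 2, 0],
    }
)

# numeric_only=True restricts the sum to numeric columns, skipping text columns.
totals = toy_df.groupby("City").sum(numeric_only=True)

# idxmax returns the index label (here, the group key) of the largest value.
top_city = totals["Citations"].idxmax()

print(totals)
print(top_city)  # Beta
```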
Q. 2-7#
Aggregate crash_df by “Alcohol_Related” and then calculate the mean for “Estimated_Damages”. What do you find?
Aggregate crash_df by “Drug_Related” and then calculate the mean for “Estimated_Damages”. What do you find?
Do drug-related or alcohol-related crashes cause more damage?
Are alcohol-related or drug-related crashes more fatal? How many people died (per 100 crashes) in each category? (Write code.)
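Group means and a per-100 rate follow the same groupby pattern. The sketch below uses a made-up five-row frame, and the 0/1 coding of the flag column is an assumption about how such a column might be stored.

```python
import pandas as pd

# Made-up frame; the 0/1 coding of the flag column is an assumption.
toy_df = pd.DataFrame(
    {
        "Alcohol_Related": [1, 1, 0, 0, 0],
        "Estimated_Damages": [5000.0, 3000.0, 1000.0, 2000.0, 3000.0],
        "Fatalities": [1, 0, 0, 0, 0],
    }
)

grouped = toy_df.groupby("Alcohol_Related")

# Mean of one column within each group.
mean_damage = grouped["Estimated_Damages"].mean()

# Mean fatalities per record, scaled to a rate per 100 records.
fatal_per_100 = grouped["Fatalities"].mean() * 100

print(mean_damage)
print(fatal_per_100)
```

A per-100 rate is just the group mean times 100, which makes groups of different sizes directly comparable.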