Homework 6#
Assignment#
Answer the following questions. To receive credit, you must show the code for every step, not just the final answer.
Create a .ipynb file and start writing your answers in it.
You must clearly specify question numbers using markdown in your file.
Name your file as HW5_<LASTNAME>.ipynb, e.g., HW5_CHEN.ipynb.
Submit a presentation, e.g., HW5_CHEN.mp4 (< 5 minutes) as well.
1. numpy Questions#
Q. 1-1#
Describe your answer to the question: “what is NumPy?” in a short paragraph (2-4 sentences).
Q. 1-2#
Import numpy in your script (recall the convention).
Q. 1-3#
Using numpy functions exclusively, create a numpy.ndarray that contains
even numbers between 0 and 9.
Q. 1-4#
Using numpy functions exclusively, create a sequence of numbers with a constant
increment of 0.5 between 0 and 50 (i.e., 0, 0.5, 1, 1.5, …, 49.5, 50).
Q. 1-5#
Create a 4x4 2-d array of random numbers from a uniform distribution
within the range of [0, 1).
Q. 1-6#
Import arcpy and set up the workspace as the class geodatabase, i.e., class_data.gdb.
Create a variable that references the crash feature class.
Q. 1-7#
Retrieve and print out all the field names (as a list) in the crash feature class.
Convert the crash feature class to a (structured) NumPy array with the following fields.
crash_fields = ['City', 'Crash_Type', 'Vehicles', 'Non_Motorists', 'Fatalities', 'Injuries', 'Alcohol_Related', 'Distraction_Related', 'Drug_Related', 'Estimated_Damages', 'Weather_Condition', 'Light_Condition', 'Crash_Severity', 'Manner_of_Collision', 'Type_of_Intersection', 'Passengers', 'Bicyclists', 'Pedestrians', 'Citations', 'Property_Dmg_Amt', 'Vehicle_Dmg_Amt']
Store the structured array in a variable named crash_arr.
Find out and print how many records (rows) are in crash_arr.
Q. 1-8#
Retrieve the first 7 elements of crash_arr.
Grab (retrieve and store as a variable) the first, fourth, eighth, 12th, 50th, and 100th rows of the array.
Grab the column Fatalities from the array.
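As a hint for working with structured arrays, here is a minimal sketch on a tiny made-up array. The array toy_arr, its three fields, and all of its values are hypothetical stand-ins and are not taken from the class geodatabase. It only illustrates how field names, row counts, slices, lists of row positions, and field access behave on a structured numpy.ndarray.

```python
import numpy as np

# Hypothetical toy records (made-up values, NOT the real crash data).
toy_dtype = [("City", "U20"), ("Fatalities", "i4"), ("Injuries", "i4")]
toy_arr = np.array(
    [("Town_A", 0, 1), ("Town_B", 1, 2), ("Town_A", 0, 0), ("Town_C", 2, 1)],
    dtype=toy_dtype,
)

# Field (column) names of a structured array live on its dtype.
field_names = list(toy_arr.dtype.names)

# Number of records (rows).
n_rows = toy_arr.shape[0]

# A slice returns the leading rows; a list of positions picks arbitrary rows.
first_two = toy_arr[:2]
picked = toy_arr[[0, 3]]  # positions are zero-based

# Indexing with a field name returns that column as a plain array.
fatalities = toy_arr["Fatalities"]

print(field_names, n_rows)
print(picked["City"], fatalities)
```

Remember that row positions are zero-based, so the "first" row is at position 0.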
Q. 1-9#
Write code to answer the following questions:
How many crashes have positive fatalities in this dataset?
What is the maximum value of fatalities in the data?
How many total fatalities are in the dataset?
Q. 1-10#
The total damage of an accident is defined as property damage plus vehicle damage. Create a new array that adds the two corresponding columns.
What is the average value (mean) of total damage per crash? What about the standard deviation?
Find out which accident has the maximum total damage in this dataset:
what was the weather condition, and
in which city did it occur?
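The questions in this part rely on element-wise arithmetic, boolean masks, and reductions. The sketch below shows those techniques on small made-up arrays. The names property_dmg, vehicle_dmg, and weather and every value in them are hypothetical and only serve as an illustration, so you still need to adapt the idea to crash_arr yourself.

```python
import numpy as np

# Hypothetical toy columns (made-up values, NOT the real crash data).
property_dmg = np.array([100.0, 0.0, 2500.0, 400.0])
vehicle_dmg = np.array([900.0, 300.0, 1500.0, 600.0])
weather = np.array(["Clear", "Rain", "Snow", "Clear"])

# Element-wise addition of two equally long arrays.
total_dmg = property_dmg + vehicle_dmg

# A comparison yields a boolean mask; summing it counts the True entries.
n_above_500 = int((total_dmg > 500).sum())

# Reductions over the whole array.
mean_dmg = total_dmg.mean()
std_dmg = total_dmg.std()  # population standard deviation (ddof=0) by default
max_dmg = total_dmg.max()

# argmax gives the position of the maximum, which can index other columns.
worst = int(total_dmg.argmax())
worst_weather = weather[worst]

print(n_above_500, mean_dmg, std_dmg, max_dmg, worst, worst_weather)
```

Note that numpy's std uses ddof=0 unless told otherwise; pass ddof=1 if you want the sample standard deviation.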
2. pandas Questions#
Q. 2-1#
Describe your answer to the question: “what is Pandas?” in a short paragraph (2-4 sentences).
Q. 2-2#
Import both numpy and pandas (recall the convention).
Q. 2-3#
Convert the total damage from a numpy.ndarray to a pandas.Series.
Q. 2-4#
Convert the crash_arr array to a DataFrame and name it crash_df.
Preview the first 5 elements.
Preview the last 8 elements.
Q. 2-5#
Retrieve columns ['Alcohol_Related', 'Drug_Related', 'Injuries'] from crash_df
using label indexing.
using position-based indexing.
Assign these columns to a new dataframe variable (call it a name of your choice).
How many crashes are alcohol related? How about drug related?
Driving Under the Influence (DUI) is a criminal (not civil) charge (up to 6 months in jail for a first offense). Driving with either drugs or alcohol above a certain limit is considered DUI. In this dataset, what is the percentage of DUI-related crashes?
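As a hint, label-based and position-based selection and an "either condition" percentage can be sketched on a small made-up DataFrame. Everything in toy_df is hypothetical, including the assumption that the flag columns hold booleans. Check how the flags are actually encoded in crash_df before reusing the idea.

```python
import pandas as pd

# Hypothetical toy table (made-up values; boolean flags are an assumption).
toy_df = pd.DataFrame(
    {
        "Alcohol_Related": [True, False, False, True, False],
        "Drug_Related": [False, False, True, True, False],
        "Injuries": [1, 0, 2, 3, 0],
    }
)

# Label-based selection with .loc and position-based selection with .iloc.
by_label = toy_df.loc[:, ["Alcohol_Related", "Injuries"]]
by_position = toy_df.iloc[:, [0, 2]]

# Summing a boolean column counts its True entries.
n_alcohol = int(toy_df["Alcohol_Related"].sum())

# "|" combines two boolean columns row by row (logical OR), so a row that
# is flagged in both columns is counted only once.
either = toy_df["Alcohol_Related"] | toy_df["Drug_Related"]
pct_either = 100 * either.mean()

print(n_alcohol, pct_either)
```

Combining the flags with a logical OR avoids double counting crashes that are flagged in both columns, which simply adding the two counts would do.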
Q. 2-6#
Aggregate crash_df by “City” and then calculate the sum.
Which city has the largest number of citations?
Q. 2-7#
Aggregate crash_df by “Alcohol_Related” and then calculate the mean for “Estimated_Damages”.
What do you find out?
Aggregate crash_df by “Drug_Related” and then calculate the mean for “Estimated_Damages”.
What do you find out?
Do drug-related or alcohol-related crashes cause more damage?
Are alcohol-related or drug-related crashes more fatal? How many people died (per 100 crashes) in each category? (Write code.)
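The aggregation questions above all follow the same split-apply-combine pattern. The sketch below shows that pattern on a small made-up DataFrame. The table toy_df, the column Some_Flag, its "Y"/"N" encoding, and all values are hypothetical and are not taken from the crash data.

```python
import pandas as pd

# Hypothetical toy table (made-up values, NOT the real crash data).
toy_df = pd.DataFrame(
    {
        "City": ["Town_A", "Town_B", "Town_A", "Town_B"],
        "Citations": [1, 0, 2, 1],
        "Fatalities": [0, 1, 0, 1],
        "Some_Flag": ["Y", "N", "Y", "Y"],  # assumed "Y"/"N" encoding
    }
)

# Group by a key column, then sum one numeric column within each group.
citations_by_city = toy_df.groupby("City")["Citations"].sum()

# idxmax returns the group label that holds the largest value.
top_city = citations_by_city.idxmax()

# A per-group mean multiplied by 100 gives a "per 100 rows" rate.
fatalities_per_100 = toy_df.groupby("Some_Flag")["Fatalities"].mean() * 100

print(citations_by_city)
print(top_city)
print(fatalities_per_100)
```

Selecting the column before calling sum or mean keeps the aggregation restricted to numeric data, which avoids errors or surprises when the DataFrame also contains text columns.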