Hadoop and Big Data
Links
Course Materials
https://legacy.gitbook.com/book/juheck/hadoop-and-big-data/details
Course S3 bucket - No public access
https://s3-us-west-1.amazonaws.com/julienheck/hadoop/
Databricks CE account creation page
https://accounts.cloud.databricks.com/registration.html#signup/community
Spark Demo Notebook
https://s3-us-west-1.amazonaws.com/julienheck/hadoop/7_spark/demo_spark.dbc
Spark Exercises
https://s3-us-west-1.amazonaws.com/julienheck/hadoop/7_spark/exercise_spark_rdd.dbc
https://s3-us-west-1.amazonaws.com/julienheck/hadoop/7_spark/exercise_spark_dataframes.dbc
https://s3-us-west-1.amazonaws.com/julienheck/hadoop/7_spark/exercise_spark_dataframes2.dbc
https://s3-us-west-1.amazonaws.com/julienheck/hadoop/7_spark/exercise_spark_dataframes3.dbc
https://s3-us-west-1.amazonaws.com/julienheck/hadoop/7_spark/Classroom-Setup.dbc
https://s3-us-west-1.amazonaws.com/julienheck/hadoop/7_spark/DBTest-Setup-Stub.dbc
Datasets
MovieLens 100k
- u.data: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/ml-100k/u.data
- u.item: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/ml-100k/u.item
- u.user: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/ml-100k/u.user
- u.genre: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/ml-100k/u.genre
- u.occupation: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/ml-100k/u.occupation
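
For orientation, here is a minimal sketch of reading ratings from u.data in plain Python. It assumes the standard MovieLens 100k layout, where each line of u.data holds four tab-separated integer fields (user id, item id, rating, timestamp). The names `Rating` and `parse_rating` and the sample lines are hypothetical, introduced only for illustration; they are not part of the course materials.

```python
from typing import NamedTuple


class Rating(NamedTuple):
    """One row of u.data (hypothetical helper type for this sketch)."""
    user_id: int
    item_id: int
    rating: int
    timestamp: int


def parse_rating(line: str) -> Rating:
    """Parse one tab-separated u.data line into a Rating.

    Assumes the layout: user id, item id, rating, timestamp.
    """
    user_id, item_id, rating, timestamp = line.rstrip("\n").split("\t")
    return Rating(int(user_id), int(item_id), int(rating), int(timestamp))


# Illustrative lines in the assumed u.data layout.
sample_lines = [
    "196\t242\t3\t881250949\n",
    "186\t302\t3\t891717742\n",
]

ratings = [parse_rating(line) for line in sample_lines]
mean_rating = sum(r.rating for r in ratings) / len(ratings)
```

To run this against the real file, replace `sample_lines` with the lines of a local copy of u.data, for example `open("u.data", encoding="utf-8")`.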
Los Angeles crime data
- Crime Data from 2010 to present: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/crime_data_la/Crime_Data_from_2010_to_Present.csv
- crime-data-la: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/crime_data_la/crime_data_la.csv
- crime-data-code-name: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/crime_data_la/crime_data_code_name.csv
- crime-data-area-name: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/crime_data_la/crime_data_area_name.csv