The aim of this assignment is to give you more experience designing and implementing concurrent programs that process a reasonably large amount of data.

The Problem Scenario

Your success in impressing the CTO of Blackler-Whistcomb with your work in Assignment 7 soon leads to you being employed as a consultant. Your first task in this new role is to explore how to persist the data produced by the ski lift data analysis and make it available to a potentially large number of skiers concurrently through a web site or mobile app.

In this assignment you will design and implement a simple data model for your analyzed and raw data, persist the data, and explore mechanisms to provide rapid response times to queries. We will use the same data as Assignment 7. You can download the data set from here.
Step 1 – Store the Data

Assignment 7 produced results stored in CSV files. We're going to build on those, and add to them of course (as it would be far too easy if we didn't!). You will need to persist the following data:

- Lift Data: the raw ski lift data you get from the CSV file (liftrides.dat)
- For all 40K skiers: the number of lift rides they do each day, and the total number of vertical metres they ski (equal to the total vertical rise of all lifts they ride) (skier.dat)
- For each of the 40 lifts: the total number of lift rides in this day (lifts.dat)
- For each hour in the ski day: the top 10 most popular lifts, ordered from the lift with the most rides to the lift with the least (hours.dat)

You will need to design these files so they can be randomly accessed. For example, if I want to see the data for skier #27894, I should be able to read it from disk with the same access time as for skier #1.

Design and implement a set of classes that can create and access these files. The specific queries you will need to process are described in the next step. Take these into account when you are describing your data model. We suggest you modify one of your submissions from Assignment 7 to write these files.

In addition, you will be required to store the number of times each skier views their summary data. When you create the skier data set, initialize this value to 0 for every skier. Whenever you get a Skier Summary Query (see below), add one to this field and return the new value as part of the results. This is the only update (write) you need to satisfy. All other queries are read only. Does this remind you of the readers-writers problem? We hope so.

As you only have data for one ski day (oddly, Day 2), you can ignore the day completely.

Step 2 – Queries

Once the data has been stored in files and indexed, create a program that will query this data. You will be supplied with test data that contains sample queries in the format: