Restaurant Ratings

Overview

For this project, you will develop a system for real estate price estimation: predicting the price of a house from its advertisement. The dataset consists of over 40,000 real examples from 7 cities: Boston, Chicago, LA, Miami, NYC, Philly, and Vegas. You will be given 20,311 labeled training samples and tested on 20,307 test samples. The features are binary indicators of whether frequent unigrams and bigrams occur in the ads. Your goal is to predict the (logarithm of the) price for the 20,307 test samples. The project is run as a competition with live leaderboards (see below for more details).

Project Rules and Requirements

Rules and Policies
Overall requirements

The project is broken down into a series of checkpoints. There are four mandatory checkpoints (Nov. 20th, Nov. 21st, Dec. 3rd, and Dec. 6th). The final writeup is due Dec. 11th. The leaderboards will operate continuously, so you can monitor your progress against other teams and towards the score-based checkpoints. All mandatory deadlines are at midnight, so the deadline “Nov. 20th” means you can submit any time before the 20th becomes the 21st.
Evaluation

Error metric

Your predictions will be evaluated based on their root mean squared error (RMSE). Your code should produce an Nx1 vector of rating predictions. If each element {$\hat{y}_{i}$} is the prediction of the rating of the {$i^{th}$} review and {$y_i$} is the true label, then the RMSE is:

{$ \mbox{Root Mean Squared Error} = \sqrt{\frac{1}{N}\sum_{i=1}^N (y_i - \hat{y}_i)^2} $}

Requirements for Each Checkpoint

For the second and third checkpoints, you must submit to the leaderboard(s). For the final checkpoint, you must submit ALL of your code via turnin to the correct project folder. Make sure that you submit any code that you used in any way to train and evaluate your method. We will be opening up an autograder that will check the validity of your code to ensure that we’ll be able to evaluate it at the end.

Detailed Instructions

Download the starter kit

You can download the starter kit here: http://alliance.seas.upenn.edu/~cis520/fall14/project_kit.zip

Inside the code directory,

Register your team name

Before you can get results on the leaderboard, you need to submit your team name. Everyone on your team is required to do this. Simply create a text file on

    $ echo "My Team Name" > group.txt
    $ turnin -c cis520 -p proj_groups group.txt

This

Submit to the leaderboard

To submit to the leaderboard, you should submit the file Once you have your
Your team can submit once every 5 hours, so use your submissions wisely. Your submission will be checked against the reference solutions, and you will get your score back via email. This score will also be posted to the leaderboard so everyone can see how awesome you are. You can view the current leaderboard here: http://www.seas.upenn.edu/~cis520/fall14/leaderboard.html

Submit your code for the final checkpoint or to test correctness

The file The time constraint for initializing your model(s) is 3 minutes. The file The time constraint for making predictions on 20,307 test samples is 10 minutes. You must submit your code for the final checkpoint. You can do so with the following:
You will receive feedback from the autograder, exactly as with the homework. The autograder reports whether the model is initialized within 3 minutes, whether the final prediction code runs within 10 minutes for 20,307 test samples, and whether the submission size is less than 50 MB. You will not get feedback about your RMSE performance on the test set. The final rankings will be released on the day of the prize ceremony, Dec. 8.
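
Because the autograder does not report RMSE, one option is to estimate it locally on a held-out part of the labeled training data. The following is a minimal Python sketch of the RMSE formula from the Error metric section, together with a random hold-out split. It is not part of the starter kit: the names `rmse` and `holdout_split`, the 20% hold-out fraction, and the toy labels are all assumptions made for illustration.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("need at least one sample")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def holdout_split(n_samples, holdout_fraction=0.2, seed=0):
    """Return (train_indices, holdout_indices) for a random split.

    The 20% default is an arbitrary illustrative choice.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_holdout = max(1, int(round(n_samples * holdout_fraction)))
    return indices[n_holdout:], indices[:n_holdout]


# Toy illustration with made-up log-price labels (not taken from the dataset).
labels = [11.2, 12.5, 12.9, 11.8, 13.1, 12.0, 12.4, 11.5, 12.7, 13.0]
train_idx, holdout_idx = holdout_split(len(labels))

# Baseline: predict the mean training label for every held-out sample.
baseline = sum(labels[i] for i in train_idx) / len(train_idx)
holdout_rmse = rmse([labels[i] for i in holdout_idx],
                    [baseline] * len(holdout_idx))
print(f"hold-out RMSE of the mean baseline: {holdout_rmse:.3f}")
```

A constant baseline like this gives a reference point: a model that does not beat it on held-out data is unlikely to do well on the leaderboard. Since leaderboard submissions are limited to one every 5 hours, a local estimate of this kind can help decide which models are worth submitting.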