In this homework, you will perform linear regression on a data set of your choice. Complete the following steps:

1. Find a data set of your choice from a valid source. You may use a previously used data set. If you have categorical data, you may want to create dummy variables.
2. Split your data set into a training set and a scoring set, and name the two data sets accordingly. The scoring data set should not include the column that you are trying to predict.
3. Import both data sets into RapidMiner and check the ranges of all attributes. If an observation in the scoring data set has a value below the training data set’s lower bound or above its upper bound for that attribute, remove that observation. Take a screenshot of the loaded data sets.
4. In the training stream, set the role of the attribute that you are trying to predict to label.
5. Perform linear regression by adding the “Linear Regression” operator to the training stream, then add the “Apply Model” operator to connect the training stream to the scoring stream. Take a screenshot of the final process stream.
6. Run the model and take screenshots of both the linear regression results (i.e., the table of regression coefficients) and the predictions made on the scoring data set. Evaluate and interpret your results by examining the attribute coefficients and the predictions made on the scoring data set. Your interpretation should answer the following questions:
   a) Which attributes have the greatest weight?
   b) What is the resulting mathematical formula for the regression line?
   c) Were any attributes dropped from the data set as non-predictors? If so, which ones, and why do you think they were not effective predictors?
   d) What can you conclude from the predictions made?

Submission Instructions: Please type up your homework using the homework template posted on Blackboard under Assignments. Include at least four screenshots: (1) the data sets loaded in RapidMiner, (2) the final process stream, (3) the linear regression results (i.e., the table of regression coefficients), and (4) the predictions made on the scoring data set. Remember to interpret your results and answer all of the questions in step 6. Only a softcopy submission is needed, through the assignment link posted on Blackboard.
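The logic behind steps 3 through 6 can be sketched outside RapidMiner to make the range check and the regression formula concrete. The Python below is an illustrative sketch only, not part of the required workflow: the data, the attribute names `sqft` and `price`, and the helper functions are all hypothetical, and it fits ordinary least squares on a single attribute, whereas the RapidMiner operator works with several attributes at once.

```python
from statistics import mean


def attribute_ranges(rows, attributes):
    """Return {attribute: (lower bound, upper bound)} over the training rows."""
    return {a: (min(r[a] for r in rows), max(r[a] for r in rows)) for a in attributes}


def within_ranges(row, ranges):
    """True if every attribute of the row lies inside the training range."""
    return all(low <= row[a] <= high for a, (low, high) in ranges.items())


def fit_simple_regression(xs, ys):
    """Ordinary least squares for one attribute: y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data: "price" is the label, "sqft" the only predictor.
training = [
    {"sqft": 1000, "price": 200},
    {"sqft": 1500, "price": 300},
    {"sqft": 2000, "price": 400},
    {"sqft": 2500, "price": 500},
]
# The scoring set has no "price" column (step 2).
scoring = [{"sqft": 1200}, {"sqft": 3000}, {"sqft": 2200}]

# Step 3: drop scoring observations outside the training range.
ranges = attribute_ranges(training, ["sqft"])
scoring_in_range = [r for r in scoring if within_ranges(r, ranges)]

# Steps 5 and 6: fit on the training set, then predict on the scoring set.
intercept, slope = fit_simple_regression(
    [r["sqft"] for r in training], [r["price"] for r in training]
)
predictions = [intercept + slope * r["sqft"] for r in scoring_in_range]

print(f"price = {intercept:.2f} + {slope:.2f} * sqft")
for row, predicted in zip(scoring_in_range, predictions):
    print(row, "->", round(predicted, 2))
```

In this toy data the scoring row with `sqft` of 3000 is removed because it exceeds the training maximum of 2500, which mirrors the filtering in step 3. The printed formula has the same form as the answer to question b), with one coefficient per attribute plus an intercept.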