
MAT353, December 2012, Dr. Maochao Xu

Variables that affect the Rating of Disc Golf Courses


Sebastian Grobe

Abstract: As disc golf is one of the fastest growing sports in the world, disc golf courses are springing up worldwide. Since course designers are interested in making their courses as attractive as possible to the public, it's important to know which variables go into the makeup of a highly rated course. By regressing course rating (as determined through DGCourseReview ratings) on such variables as year designed, multiple tees, proximity to the next course, and whether the baskets are DISCatchers or not, we will determine how statistically significant the effect of each variable on course rating is.

1. Introduction

Disc golf is a sport that can be played with very little equipment. All a player needs is a Frisbee and a course. A course consists of tees and baskets, and the various obstacles that lie between the two. The goal is to get the Frisbee from the tee into the basket in as few throws as possible, just as the goal of golf is to get the ball in the hole in as few strokes as possible. While there are many studies about the economic impact of traditional ball golf courses and tournaments, we are not aware of comparable studies for disc golf courses. For example, it would be desirable for sponsors and designers of future courses to know what the general key indicators of a successful course are. Does the average player prefer longer courses or simply courses with more holes? Is the brand of the basket significant? Would it be wise to aim for a concentration of several courses within a narrow region, or are courses that are geographically isolated, and thus not in direct competition with other courses, more popular? Is the investment in expensive concrete tees justified, or are cheaper grass tees sufficient? All of these are important considerations for the design and planning of future courses.

Therefore, it would be advantageous if we could identify certain variables that can serve as reliable predictors for the popularity of a course. Numerous online disc golf directories evaluate courses systematically, and a careful statistical analysis of these ratings might give some answers to the questions posed above.

2. Data Gathering

Data was gathered from the most respected disc golf directory on the internet (DGCourseReview.com), and courses were filtered so that only permanent courses in the United States with at least five reviews were taken into account. These three filters (permanent, US, 5+ reviews) were applied for the following reasons. Only permanent courses that are always playable were chosen, because temporary or practice courses could skew the ratings. Only US courses were chosen so that we have the same demographic of reviewers, since ratings from European or Asian reviewers may not be comparable to those from American reviewers. Lastly, as we want credible ratings, we only used courses with a minimum of 5 reviews. This resulted in about 2500 courses. Since gathering data for all of them would take too long, the list was alphabetized by course name and data was pulled for every 20th course, resulting in 125 observations. Each of these observations had a rating, year designed, number of holes, distance to next course, multiple tees, tee type, number of players, and whether the baskets were DISCatchers or not.
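The every-20th-course selection described above is a systematic sample. A minimal Python sketch of that step follows; the course names are hypothetical placeholders standing in for the roughly 2500 filtered courses, and the function name `systematic_sample` is introduced here for illustration only.

```python
def systematic_sample(names, step=20):
    """Sort names alphabetically and keep every `step`-th one."""
    ordered = sorted(names, key=str.lower)
    # Start at index step - 1 so the 20th, 40th, 60th, ... entries are kept.
    return ordered[step - 1::step]


# Hypothetical stand-in for the ~2500 filtered course names.
courses = [f"Course {i:04d}" for i in range(2500)]
sample = systematic_sample(courses, step=20)
print(len(sample))  # 125
```

Because the list is ordered by name rather than by any rating-related attribute, taking every 20th entry should behave much like a random sample, provided course names carry no systematic relation to quality.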

3. A Look at the Data

The rating variable that we're regressing on is continuous from 0 to 5. The distribution of the ratings gathered is pictured to the right. It appears to be normally distributed around 3, which allows us to proceed with the regression. The Shapiro-Wilk test gives .537; read as a p-value, this means we cannot reject the hypothesis that the ratings are normally distributed.

1) The first variable regressed is year designed. The histogram (right) shows the tremendous growth that disc golf is currently experiencing. Below, rating was graphed against year designed and a linear trend line was added, but there seems to be no clear relationship. This finding that newer courses are apparently no better than older ones is rather discouraging. It suggests all the more that systematic studies like the present one are missing and should play an important role in the design of future courses.

2) The next variable regressed is number of holes. As one can see (right), the most popular number of holes is clearly 18, followed by 9. As with year designed, rating was graphed against number of holes, but in this case there is a clearly visible relationship.

3) The third variable used was distance to the next nearest course. The average distance was 6.4 miles, with a positive skew. This histogram could be consistent with a uniform, uncorrelated, homogeneous distribution of locations. For instance, it is known that for uniformly distributed random numbers the nearest-neighbor spacing distribution is exponential as a function of the spacing. So the present diagram would not be inconsistent with a uniform distribution of disc golf courses.

4) Multiple tees is the first of the three dummy variables used. Only 37% of courses had more than one tee.

5) The second dummy variable was for cement tees. Of the 125 observations, only 46% had cement tees. The other types (grass, gravel, wood chip, dirt, etc.) accounted for the other 54%.

6) The last dummy variable was whether the baskets are DISCatcher brand or not. DISCatcher is currently the number one brand of disc golf baskets, yet it accounted for only 34% of courses.

7) The last variable used in the regression was the number of players who have marked the course as played. The average number of users who have played the courses I observed was 126. The most played course had 561 players and is located in Milwaukee, WI. As we can see from the second graph, there is a positive relationship between the rating and number of players.
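The remark in item 3 about uniformly distributed random numbers can be checked with a small simulation. The sketch below is a one-dimensional illustration only, not a model of the actual course locations: it draws uniform random points, sorts them, and computes the coefficient of variation (standard deviation divided by mean) of the gaps between neighbors. For an exponential distribution this ratio equals 1. The function name `spacing_cv` and the sample size are assumptions made for this example.

```python
import random
from statistics import mean, pstdev


def spacing_cv(n_points, seed=0):
    """Coefficient of variation of the gaps between sorted uniform points."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    points = sorted(rng.random() for _ in range(n_points))
    gaps = [b - a for a, b in zip(points, points[1:])]
    return pstdev(gaps) / mean(gaps)


cv = spacing_cv(10_000)
print(round(cv, 2))  # close to 1 for exponentially distributed gaps
```

A ratio near 1, together with a positively skewed histogram of gaps, is what the exponential spacing distribution predicts.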

4. The Correlation Matrix

Below I've shown the correlation matrix. Each matrix element is computed as

$$r_{xy} = \frac{\overline{(x - \bar{x})(y - \bar{y})}}{\sqrt{\sigma_x^2 \, \sigma_y^2}}$$

where the averages (barred values) are computed over all 125 courses and $\sigma_x^2$ and $\sigma_y^2$ denote the variance of each variable. It can range from -1 (perfectly anti-correlated) to 1 (perfectly correlated), and the matrix gives the correlation for all 36 variable pairs.

[Table: correlation matrix of Course Rating, Course Number, Year Designed, Number of Holes, Distance to Next Course, Multiple Tees, Tee Type, Number of Players, and DISCatcher]
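The correlation coefficient described above can be sketched in a few lines of Python. The variable names in the list are shorthand introduced here for the nine quantities in the matrix, and the small `holes` and `rating` lists are made-up numbers for illustration, not values from the data set.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation: mean product of deviations over the root of the variances."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov / sqrt(var_x * var_y)


# Shorthand names (assumed) for the nine variables in the matrix.
variables = [
    "rating", "course_number", "year_designed", "holes", "distance",
    "multiple_tees", "tee_type", "players", "discatcher",
]
pairs = list(combinations(variables, 2))
print(len(pairs))  # 36

# Made-up illustration data, not taken from the study.
holes = [9, 9, 18, 18, 27]
rating = [2.5, 3.0, 3.5, 3.0, 4.0]
print(round(pearson(holes, rating), 3))
```

Nine variables give 9 * 8 / 2 = 36 distinct pairs, which matches the count stated above.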

From this matrix, we can immediately make a few inferences. To begin, the strongest correlation, at .7, is between course number (or course id) and year designed. This makes perfect intuitive sense because the newer a course is, the higher its id number. The second conclusion we can draw is that the strongest correlation that course rating has is with number of holes. This also makes sense because, as I've shown above, courses with more baskets are more attractive and thus merit higher ratings. The third conclusion I can make is that there may be multicollinearity between number of players and several other variables. For instance, there's a strong negative relationship between year designed and number of players. This relationship comes from the notion that the longer a course has been around, the more players have had a chance to play it. A fourth observation might be bad news for manufacturers of the expensive DISCatchers: the rating of a course appears to be uncorrelated with the brand of baskets.

5. The Regression

Here are the parameter, R-squared, and ANOVA tables for the primary regression.

From this table, we can see that all but three variables are significant. The insignificant variables are course number, distance to next course, and DISCatcher. As these are not significant, I removed them in my final regression, which increased the adjusted R-squared slightly. Before doing my second regression, I wanted to check for heteroskedasticity (non-constant variance), so I plotted true ratings against estimated ratings and produced the following charts.

Thankfully, we have clear homoscedasticity and do not have to do any transformation of the rating variable.
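The adjusted R-squared used in this section to compare the primary and reduced models penalizes each additional predictor, which is why dropping insignificant variables can raise it even when the plain R-squared falls slightly. A minimal sketch of the standard formula follows. The two R-squared inputs are hypothetical numbers chosen for illustration, not the values from the regression tables; only the sample size (125) and the predictor counts (8 and 5) come from the text.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical R-squared values; 125 observations, 8 versus 5 predictors.
full = adjusted_r_squared(0.400, n_obs=125, n_predictors=8)
reduced = adjusted_r_squared(0.395, n_obs=125, n_predictors=5)
print(round(full, 4), round(reduced, 4))
print(reduced > full)  # True
```

In this illustration the reduced model gives up a little raw fit but gains in adjusted R-squared because it uses three fewer predictors.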

6. The Final Regression

When the three variables that weren't significant are removed, we arrive at this final table.

Below are the residual charts for the reduced model, which are very similar to those of the primary regression.

7. Summary

In summary, it's very interesting to see how the different variables have an impact on rating. The final equation I developed has the form:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5$$

where $Y$ is the course rating, $X_1$ is year designed, $X_2$ is number of holes, $X_3$ stands for multiple tees, $X_4$ is tee type, and $X_5$ is number of players. All of these relationships make sense. For $X_1$, the newer a course is, the better the rating. This could come from the notion that designers have more experience, course equipment is newer, or that new courses don't have their potential flaws exposed yet. For $X_2$, the relationship is very clear. The more baskets there are, the longer the round takes and the more worthwhile it is to drive there, and therefore the course merits a better rating. It also makes sense that $X_3$, multiple tees, has a positive relationship with rating. Multiple tees give the course variety, diversity, and more replay value, all of which positively affect rating. The positive effect of $X_4$, having cement tees, also makes sense. Disc golfers definitely prefer having solid ground to tee off on, rather than something they can slip on like gravel or grass. This is consistent with the corresponding correlation matrix element of 0.3. Lastly, for $X_5$, the causality might be ambiguous, but the correlation makes sense. Good courses draw in more players than bad courses, so this relationship is also logical.

8. Future Ideas

In the future, it may be interesting to do further regressions on variables like multiple pin positions (which would be a dummy variable), average distance per hole, elevation changes and woodedness, and the number of people who marked the course as a favorite instead of as played. Additionally, my regression might have been improved if I had opted to regress on number of baskets rather than number of holes, as there is sometimes confusion between multiple tees and additional holes. There are also other factors that aren't even measured yet that could be studied. How often do players lose a disc? Does the presence of unreachable areas like fenced areas or lakes make a course more fun or more discouraging? Would a course be viewed as more interesting and challenging if the par was inflated? Is there an optimum change of elevation? The two extreme cases of a perfectly flat course or one that requires mountain gear are probably both equally unpopular.

An unrelated but equally important study could also be performed to obtain a better demographic profile of the typical disc golf player. What is their income? What is the average number of players that play a round together? How many rounds do they play a year? All of these factors could be used for more effective advertisement of this fascinating sport.

9. References

[1] DGCourseReview.com (2012)
