Forum: hoop
Page 9774
Subject: TSN Price Change formula


  Posted by: Guru - [330592710] Mon, Nov 03, 2003, 12:41

What is the underlying TSN price change formula?

This discussion started in the Arroyo thread, but it really should command a separate thread, if for no other reason than that we'll never find it in the archives if it's listed under an "Arroyo" title.

I did solve the formula for baseball back in 1999. Here is the report: link,

Since then, there have been some changes, but as each was introduced, it has generally been diagnosed as an incremental adjustment on top of the original formula.

1. Prices became daily, rather than weekly. The same general formula was used, with only the general sensitivity factor(s) changed. Over the years since daily repricing has been the standard, there have been tweaks to the sensitivity, ratcheting up or down. Usually, this factor remains constant throughout a season, although there was at least one instance where this was clearly not true.

2. Gravity was added. This seems to be a simple deduction for players that qualify. The unknown details are the criteria for qualifications for gravity, and whether "partial gravity" exists for those who are on the cusp.

3. A dampening factor was added, which has the effect of limiting the gain or loss for the most extreme daily changes. When this was first implemented and studied, the general assessment was that the sensitivity factors were applied to the square roots of the buys and sells, rather than the raw numbers.

I think these were the significant changes. I've seen nothing that causes me to suspect that the underlying formula isn't essentially the same as in 1999 (other than the changes noted above).

A consequence of this is that pure ownership data - even if totally accurate - is not sufficient to fully reverse engineer the formula. You need to have the underlying buys and sells, since the sensitivity of a buy tends to be less than the sensitivity of a sell, especially when new team formation is still occurring.

I'm sure there are other threads (perhaps more likely in the baseball forum) dealing with past attempts to work out the details. I'll leave it to others if they want to ferret any of them out.
 
1Sludge
      Sustainer
      ID: 25919714
      Mon, Nov 03, 2003, 12:47
Hmm... I recall playing around with it (with Richard et al.) in the baseball forum a couple years back. I don't think that the linear formula worked to a satisfactory level. Still lots of outliers (not even counting the few infamous ones). Hmmm...
 
2RecycledSpinalFluid
      Dude
      ID: 367232615
      Mon, Nov 03, 2003, 12:53
If you need the underlying buys and sells, I have that too. My database is getting fat in a hurry. Either time for a larger hard drive, or purge the data after two weeks as I had planned.
 
3JCS
      Sustainer
      ID: 20102934
      Mon, Nov 03, 2003, 12:57
Just reposting some stuff from Arroyo's thread that I think is relevant to this thread:

----------------------------------------------

110 JCS
Sustainer
ID: 20102934
Mon, Nov 03, 2003, 12:22
-- DISCLAIMER --
Sorry for this post which has nothing to do with Arroyo and which probably won't interest 99% of the readers.
-- END OF DISCLAIMER --

I'm not up to date on what has already been done on this forum. In fact, I didn't know that some people had already tried to discover that formula. Maybe you can tell me more about what has been done and the validity of the results.

I brought up the ML (machine learning) argument because it's my main subject of study this year and I think it fits the problem perfectly. We have the labeled data:
* x = global activity on player A compared to global activity (Example : net result = +78 buys on a global 1535 trades number)
* f(x) = price change for player A.

The learning data could be all the data (-> the pairs (x,f(x))) we have since RSF provides us the ownership numbers. Of course, the more data we have, the better the results of our program will be. I believe right now we only have 4 or 5 days of ownership numbers, so the results shouldn't be spectacular.

Next, we have to find an algorithm that works well with continuous data (as opposed to discrete data) and which also stays fairly efficient with errors in the training data (RSF himself doesn't guarantee his ownership numbers to be 100% correct). Since this case is actually the most common case, it should be fairly easy to choose an algorithm.

Last, we have to have an idea of the target function f to know how to represent it. Here, based on my own experience in this game, we have to represent the target function as a logarithmic function, something like f(x) = a + (b*ln(c*x)).

Once we have all this, the program should find the a, b, and c parameters pretty accurately. Then it's pretty obvious that the value of f(x) obtained is rounded to the closest multiple of 10 (f(x) = -17.58 becomes a -20k price change).

Trial and error is an algorithm that can be used, I guess; its main problem, to me, is that it could stop at a local extremum. I believe other algorithms can lead us to the global extremum, that is, to the most accurate description of the target function (the repricing formula).

But maybe I'm getting all carried away and this formula can be found (has already been found?) by much simpler ways.



113 Sludge
Sustainer
ID: 25919714
Mon, Nov 03, 2003, 12:36
JCS -

If you're going to give a specific function that you think describes the model (f(x) = a + b*ln(x), e.g. - note that the "c" is not necessary as ln(c*x) = ln(c)+ln(x), and ln(c) can just be absorbed by the intercept), then just use linear regression techniques. So long as there's a stochastic component (which there will always be without complete trade data), it's the simplest thing to do that I can think of. Even if you have complete data, and you nail the form of the function just right, linear regression will do the trick. If you think it's nonlinear, there's also plenty of software packages that can handle that as well.

By trial-and-error, I meant just trying several classes of models and picking the one that does the best job (but not necessarily the best coefficient of determination - R^2).

Passing on a bit of advice that my modeling professor passed on to us while I was in grad school: Be careful that when you learn how to use a hammer, everything doesn't start to look like a nail. At that point, he went on to describe how he once tried to use a degree 20 polynomial for prediction. :)
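The regression suggestion above can be sketched in a few lines. This is only an illustration, not anything posted in the thread: it fits f(x) = a + b*ln(x) by ordinary least squares on ln(x), the function name `fit_log_linear` is made up for the example, and the data are synthetic values generated from known a and b so the fit can be checked. Note that ln(x) needs x > 0, so this particular form cannot take a net-sell value directly.

```python
import math


def fit_log_linear(xs, ys):
    """Ordinary least squares for y = a + b*ln(x); returns (a, b).

    Hypothetical helper for illustration. Requires every x > 0.
    """
    zs = [math.log(x) for x in xs]          # regress on z = ln(x)
    n = len(zs)
    mean_z = sum(zs) / n
    mean_y = sum(ys) / n
    sxx = sum((z - mean_z) ** 2 for z in zs)
    sxy = sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
    b = sxy / sxx                            # slope on ln(x)
    a = mean_y - b * mean_z                  # intercept (absorbs any ln(c))
    return a, b


# Synthetic data only (not TSN data): generated with a = 3, b = 2.
xs = [0.5, 1.0, 2.0, 4.0, 8.0]
ys = [3.0 + 2.0 * math.log(x) for x in xs]
a, b = fit_log_linear(xs, ys)
print(round(a, 6), round(b, 6))
```

With real price changes the fit would not be exact, because the published changes are rounded to 10K steps and the trade data are incomplete.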
 
4Guru
      ID: 330592710
      Mon, Nov 03, 2003, 12:57
There were certainly a few infamous outliers. Maybe even "outliars" is a better term.

And while there were also a few unexplained changes, they tended to be isolated instances. Further, there was never any plausible rationale to explain these outliers. Thus my conclusion at the time was that the underlying data for those instances was bungled (either theirs or ours). There was certainly at least one "rogue" programmer at Small World at the time.

Casting aside those few anomalies, I think the linear model explained things quite well.
 
5Guru
      ID: 330592710
      Mon, Nov 03, 2003, 13:00
[no plausible explanation, other than that there was a random number added to the occasional price change - which might have been there as an attempt to create confusion.]
 
6 Sludge
      Sustainer
      ID: 25919714
      Mon, Nov 03, 2003, 13:04
RSF -

How large is the file? After zipping? If it's under 2-3 meg, feel free to email it to me. Does it have the corresponding price changes?
 
7RecycledSpinalFluid
      Dude
      ID: 127421915
      Mon, Nov 03, 2003, 13:58
Sludge, the last 3 days worth of data zipped up comes to 4.2 MB (13.2 MB unzipped). I can break it up to individual days if you want.
 
8Smartone @work
      ID: 22815228
      Mon, Nov 03, 2003, 14:14
Looking at the names appearing at the bottom of the ownership table, I started wondering whether TSN has added bogus teams to inflate the total number of teams slightly above 5000 (the number of teams they had in the past 2 years), with the players on these teams selected randomly and the ownership/price algorithm ignoring them. Therefore, RSF/others, it would be really interesting to look at teams with "suspected" activities and see whether they are for real (with trading activity, without a common theme such as ex-Euro players, undrafted rookies, etc.) -- and perhaps to see which leagues they are in.

just my 1.5 cents
 
9RecycledSpinalFluid
      Dude
      ID: 127421915
      Mon, Nov 03, 2003, 14:17
I know my data contains "dummy" teams, but I do not think those teams show up in the "Total Number Of Teams" that you are competing against, nor do I think they affect player pricing, but that is just a guess on my part. If they did, there would be some serious "game integrity" issues that TSN surely doesn't want to deal with.
 
10Astade
      ID: 214361313
      Mon, Nov 03, 2003, 14:23
Smartone, that could be possible and I thought about that.

As I mentioned in the Player ownership thread a couple of days ago, my main concern (numbers-wise) is the formula for the 'Losers'. I'm just an amateur, but some of those numbers just don't add up.

If I were to theorize, I think a lot of those 'bogus' names on rosters are the result of some managers using large chunks of money on Studs and then finding themselves with 550K and choosing a random player for fun. If there are quite a few managers who didn't research much for their initial draft, it seems plausible. Then again, I don't know the demographics for TSN Ultimate players, and I always assumed they were the 'cream of the crop' (excluding me;)).

If these 'dud' players are still on rosters 2-3 weeks later (when respectable 500K players are out there), then I think we should be concerned.
 
11Sludge
      Sustainer
      ID: 25919714
      Mon, Nov 03, 2003, 15:05
RSF -

I can swing 4.2 meg I believe. If you've got the bandwidth, send it over.
 
12Sludge
      Sustainer
      ID: 25919714
      Tue, Nov 04, 2003, 17:51
For anyone who's interested, I spent the better part of the day compiling all the data that should be relevant for the 11-1 and 11-2 price changes. I haven't even had a chance to try and start figuring out the formula yet, that's on the agenda for tomorrow.

BuysSells.xls

I did some spot checking, and everything is consistent as far as I can tell.
 
13Sludge
      Sustainer
      ID: 25919714
      Tue, Nov 04, 2003, 17:52
Oh, yeah, almost forgot. A big "thank you" to RSF and Guru for the data.
 
14T-Mac
      ID: 289232719
      Tue, Nov 04, 2003, 19:46
ure missing the most important columns. % of total buys, and % of total sales.

 
15Sludge
      Sustainer
      ID: 24981818
      Tue, Nov 04, 2003, 20:15
No I'm not.
 
16T-Mac
      ID: 289232719
      Tue, Nov 04, 2003, 23:10
no anti gravity Guru. Looking at the data ppl have lost money despite ppl buying into them. I would assume you need at least a certain # to overcome gravity, isn't that anti-gravity?
 
17Guru
      ID: 330592710
      Tue, Nov 04, 2003, 23:15
"Anti-gravity" would allow heavily owned players to continue to increase in value in spite of the lack of buys. In effect, if a player is heavily owned and not subject to selling, then his price would continue to rise at some modest pace.

I have no reason to believe that this has been implemented, although it has been discussed before - even by TSN - as something they have considered.
 
18Punk42AE
      ID: 36635522
      Tue, Nov 04, 2003, 23:23
GO-GO Griffey -660K
 
19T-Mac
      ID: 289232719
      Tue, Nov 04, 2003, 23:24
"First, we ask, "Do a lot of our owners own this player?" Then we ask, "Was this player traded frequently during the last pricing period?" If the answer to both questions is "no", then the player's price drops by a maximum predetermined
amount that varies from game to game. If the answer to both questions is "yes", then this aspect of our repricing system does not drop the player's price at all. If the answer to one question is "yes" but the answer to the other is "no", then the player's price drops by an intermediate amount."

having high ownership and no activity looks like it fits right under this yes to one and no to the other category. So you are saying as long as his ownership is high he can't go down? unless sold?

btw im not disagreeing with you, i just want to find out what really happens cuz i obviously don't know.


 
20T-Mac
      ID: 289232719
      Tue, Nov 04, 2003, 23:31
the way I am thinking gravity works is like this:

Let's say the bar is at 0.5-0.8% buys (just some random #). If a certain player hits 0.5%, his price change will be zero; above 0.8%, he will gain money according to a given formula. If his buys are at 0.3%, he loses 10k; at 0.2%, he loses 20k; and for anything lower than that, down to zero, he loses 30k.
 
21RecycledSpinalFluid
      Dude
      ID: 204401122
      Tue, Nov 04, 2003, 23:32
If ownership is above a certain % (gravity threshold):
1.) No ownership gains & No Losses = No price change
2.) Gains = Price increase
3.) Losses = Price decrease

If ownership is below a certain % (gravity threshold):
1.) No ownership gains & No losses = -30K gravity
2.) Gains = Price increase
3.) Losses = Price decrease
 
22RecycledSpinalFluid
      Dude
      ID: 204401122
      Tue, Nov 04, 2003, 23:34
Gravity is based on total ownership, not daily movement.
 
25Guru
      ID: 330592710
      Wed, Nov 05, 2003, 08:47
RSF - your formula [21] is what I would expect to be the case. Seems like the simplest and most intuitive.

T-Mac[19] "So you are saying as long as his ownership is high he can't go down? unless sold?"

Yes. As long as any player is owned by enough teams to be immune to gravity, then the only thing that will influence the price (up or down) is trades.

In effect, there are two ways for a player's price to drop: gravity, and sells. There is one way for a price to go up: buys.

I haven't fiddled with this current data, however.
 
26Sludge
      Sustainer
      ID: 25919714
      Wed, Nov 05, 2003, 10:33
Some observations...

1) Gravity seems to turn off at around 100-110 ownership level. (Sort the data by price change then ownership levels to see this.)

2) Gravity seems to be the way T-Mac suggested in #20, but it's hard to pin down exactly with only the two days' worth of data. In that same sort, if you go down to the block of four players with -20 price gain starting with Calbert Cheaney and ending with Rasheed Wallace, the following jumps out:

Calbert Cheaney, ownership 27, 12 buys, 0 sells, -20 price gain.

Kelvin Cato, ownership 51, 17 buys, 0 sells, -20 price gain.

Mike James, ownership 75, 19 buys, 2 sells, -20 price gain.

Rasheed Wallace, ownership 75, 20 buys, 0 sells, -20 price gain.

This limited data suggests to me that a constant of -30 is applied to any gains or losses for players with an ownership below a certain level, and that the constant disappears after they rise above that level. Unfortunately, there were no players with low ownership levels that had a gain of -10.
 
27Sludge
      Sustainer
      ID: 25919714
      Wed, Nov 05, 2003, 10:49
You can also see this by looking at the -40 price gains and comparing the heavily owned players to the lightly owned players.
 
28Guru
      ID: 330592710
      Wed, Nov 05, 2003, 11:35
Here are a few examples of players on the gravity cusp for Nov. 1.

LaPhonso Ellis
Prior ownership: 107 (teams)
Buys: 0
Sells: 1
Ending ownership: 106
In gravity? Yes

This suggests that ownership by 106 or 107 teams is low enough to induce gravity. This represents roughly 2% of all teams.

Michael Finley
Prior ownership: 119
Buys: 0
Sells: 4
Ending ownership: 115
In gravity? No

Finley seems to be out of gravity with slightly higher ownership and little trading.

Lonnie Baxter
Prior ownership: 117
Buys: 2
Sells: 28
Ending ownership: 91
In gravity? Yes

Baxter’s ending ownership is low enough for gravity. It is unclear whether his prior ownership would have qualified as well.

Francisco Elson
Prior ownership: 107 (note: spreadsheet shows zero, which is obviously incorrect)
Buys: 11
Sells: 7
Ending ownership: 111
In gravity? No

Elson started the day with the same ownership as Ellis. It is unclear whether he escaped because of his trading activity, or whether his ending ownership was just over the threshold.

Joe Smith
Prior ownership: 96
Buys: 8
Sells: 1
Ending ownership: 103
In gravity? No

This is an interesting one. His before and after ownership are both less than LaPhonso Ellis, yet Smith appears to be out of gravity. His trade activity seems to have been sufficient to overcome the ownership barrier.
 
29Sludge
      Sustainer
      ID: 25919714
      Wed, Nov 05, 2003, 12:17
If my conjecture is right, then the formula should look something like this:

Price Change = -30*I(O<N) + f(B-S), where

O is the ownership, N is the cutoff for gravity, B is the "correct" measure of buys, and S is the "correct" measure of sells. I(O<N) is the indicator function, which is 1 if O is less than N (subject to gravity) and 0 otherwise (not subject to gravity). f is some monotone increasing function with f(B-S) = 0 if B-S = 0.

So, if the price change is plotted versus B-S, a nice curve should fall out, where those players that are affected by gravity should follow the same function, but should be below the bulk of the data. Kinda like this:



Three observations that are immediate:
1) If log(%Buys+1)-log(%Sells+1) isn't B-S, then it's darn close.
2) It's not linear.
3) The gravity players fall out quite nicely.

This looks very much like a logistic function.
 
30Sludge
      Sustainer
      ID: 25919714
      Wed, Nov 05, 2003, 12:28
Marcus Banks, 11-2 is another interesting one.

112 Ownership prior/100 post, 12 sells, 0 buys, -10 price change.

BTW, that graph excludes 500K players.
 
31Guru
      ID: 330592710
      Wed, Nov 05, 2003, 12:35
A few more obvious examples from Nov. 2 which show that lightly owned players with some trading activity are exempt from gravity:


DeShawn Stevenson
Prior ownership: 49
Buys: 13
Sells: 0
Ending ownership: 62
In gravity? No


Kelvin Cato
Prior ownership: 68
Buys: 14
Sells: 0
Ending ownership: 82
In gravity? No

In both cases, the before and after ownership levels seem to be easily within gravity bounds. But each player has a price change of +$10K, which seems consistent with the buy activity only.

Thus, I think that the gravity function is still one with a value of either ‘0’ or ‘1’, but depends not only on ownership, but also on current trading activity. It is only a ‘1’ if ownership is low AND buying activity is also very low.
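A minimal sketch of that 0-or-1 gravity function follows. The two cutoffs are not known from the data in this thread, so they are required arguments, and the values used in the usage lines (110 teams, 5 buys) are placeholders picked only to illustrate the logic, not estimates. The names `in_gravity` and `gravity_adjustment` are hypothetical; the -30 ($K) deduction is the figure discussed in the posts above.

```python
GRAVITY_PENALTY = -30  # in $K, the constant deduction discussed above


def in_gravity(ownership, buys, ownership_cutoff, buy_cutoff):
    """Return 1 only if ownership is low AND buying activity is low, else 0.

    Both cutoffs are unknown; the caller must supply assumed values.
    """
    low_ownership = ownership < ownership_cutoff
    low_buying = buys < buy_cutoff
    return 1 if (low_ownership and low_buying) else 0


def gravity_adjustment(ownership, buys, ownership_cutoff, buy_cutoff):
    """Deduction (in $K) added on top of the trade-driven price change."""
    return GRAVITY_PENALTY * in_gravity(ownership, buys, ownership_cutoff, buy_cutoff)


# Placeholder cutoffs, for illustration only.
OWN_CUT, BUY_CUT = 110, 5
print(in_gravity(106, 0, OWN_CUT, BUY_CUT))   # Ellis-like case: low ownership, no buys
print(in_gravity(115, 0, OWN_CUT, BUY_CUT))   # Finley-like case: ownership above cutoff
print(in_gravity(62, 13, OWN_CUT, BUY_CUT))   # Stevenson-like case: low ownership, many buys
```

A rule based on raw counts like this will not reproduce every observation quoted in the thread; it only expresses the AND condition.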
 
32Guru
      ID: 330592710
      Wed, Nov 05, 2003, 12:37
... and maybe it is total trading activity, rather than only buying, that is factored in.

Banks' initial ownership might be enough to keep him out of gravity. If not, then perhaps his trading (selling) activity is the reason.
 
33Sludge
      Sustainer
      ID: 25919714
      Wed, Nov 05, 2003, 12:59
Well, removing as many of the gravity players that I could identify, I used some handy-dandy nonlinear regression procedures to come up with:

Price Change = 325.7 * (0.5 - 1/(1+exp(16.02*x))),

where x = log(%Buys+1) - log(%Sells+1).

Of course, this reduces to a function of the ratio of (%Buys+1)/(%Sells+1), but I think the above representation is best.

Here's another plot, this time with the estimated curve drawn over the data.



Not a perfect fit, but not bad either. So if a player is subject to gravity, just knock 30K off of the price change that this curve would give you.
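For anyone who wants to plug numbers in, here is a small sketch of the fitted curve with the 30K gravity deduction and rounding to the nearest 10. It assumes %Buys and %Sells are fractions between 0 and 1, that log is the natural log, and that rounding is to the nearest multiple of 10; the function names are made up for the example, and whether TSN rounds this way is not established.

```python
import math


def trade_signal(buy_share, sell_share):
    """x = log(%Buys + 1) - log(%Sells + 1), with shares as fractions in [0, 1]."""
    return math.log(buy_share + 1.0) - math.log(sell_share + 1.0)


def logistic_change(buy_share, sell_share, scale=325.7, rate=16.02):
    """Unrounded price change in $K from the fitted logistic curve."""
    x = trade_signal(buy_share, sell_share)
    return scale * (0.5 - 1.0 / (1.0 + math.exp(rate * x)))


def predicted_change(buy_share, sell_share, in_gravity=False):
    """Rounded prediction in $K; subtracts 30 if the player is in gravity."""
    raw = logistic_change(buy_share, sell_share)
    if in_gravity:
        raw -= 30.0
    # Assumed rounding rule: nearest multiple of 10.
    return int(round(raw / 10.0)) * 10


print(predicted_change(0.10, 0.0))                   # 10% of buys, no sells
print(predicted_change(0.10, 0.0, in_gravity=True))  # same trades, in gravity
print(predicted_change(0.05, 0.05))                  # equal shares
```

A player with equal buy and sell shares gets x = 0 and therefore no trade-driven change.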
 
34Guru
      ID: 330592710
      Wed, Nov 05, 2003, 13:04
Sludge, part of the error is due to rounding, as TSN rounds all price changes in $10,000 increments. That most likely explains the width of the "shoelaces".

Does that formula (and/or graph) apply to both days? Or just one of them? (I assume that the %Buy and %Sell values are determined based upon each single day's activity.)
 
35Sludge
      Sustainer
      ID: 25919714
      Wed, Nov 05, 2003, 13:10
That's both days, and the %Buy = Buys / Total Buys for that Day. Similarly for the %Sells.

I was actually in the middle of rounding the predictions to the nearest 10 when I refreshed the page. After doing that, 17 of the 185 observations (9%) were off by 10k in one direction or the other, and none were off by more than 10k.
 
36Sludge
      Sustainer
      ID: 25919714
      Wed, Nov 05, 2003, 13:24
BTW, I fixed Elson on the worksheet. Anyone see any others that don't look right?
 
37Ender
      Donor
      ID: 459217
      Wed, Nov 05, 2003, 13:33
2 things, Sludge:

I am curious why you think the difference is better than the ratio, given that they are equivalent. This is just the musing of a mathematician and really has no bearing on the formula or the task at hand.

2nd, I think your formula begs the questions:
Why 325.7? and Why 16.02?

I don't mean why from your perspective, because the data says what it says. I mean, if this is in fact the formula, then why would those 2 numbers be chosen for the formula? Any theories? They look like random numbers to me. Am I missing any kind of significance?
 
38Sludge
      Sustainer
      ID: 25919714
      Wed, Nov 05, 2003, 13:44
Ender -

The function just looks nicer to me if you don't use the form that expresses it in terms of the ratio, that's all.

Why 325.7 and why 16.02? Keep in mind that those are just estimates. If this is the form of the formula that TSN uses, then those are just estimates of the actual values that they use in their formula. The significance of the 325 is important in that when it is divided by 2, it defines the maximum loss and gain possible (+/- 160). The 16.02 is just a scaling factor that determines how fast or slow that limit is approached.

Now, the usual cautions apply here. Is +/- 160 the maximum loss or gain? I don't know. This really goes to the fact that I haven't necessarily identified the formula that TSN uses. I've only approximated it for the range of buys and sells that I have data for. That it might not be the actual formula is really irrelevant so long as it is applied for data in this range. If you have a guy that just goes off the charts in terms of buys and/or sells, then all bets are off. It would be nice to validate it, though.

Somebody clue me in. What's been the largest price gain/loss this year? Surely over 160K.
 
39Sludge
      Sustainer
      ID: 25919714
      Wed, Nov 05, 2003, 13:56
Another thought is that they may use a function that is piecewise linear.
 
40Sludge
      Sustainer
      ID: 25919714
      Wed, Nov 05, 2003, 14:05
Answered my own question. -170 is the largest drop. +150 is the largest gain.

So I would think that it's somewhere in there, and that 325 isn't too far off.
 
41Guru
      ID: 330592710
      Wed, Nov 05, 2003, 14:11
Maximum loss so far is -170, which was the first day loss for Lamar Odom. The second worst loss was Duncan, who dropped 160 today.

Max gain has been 150 (twice for Arroyo on Oct 30 and Oct 31).

 
42Guru
      ID: 330592710
      Wed, Nov 05, 2003, 14:28
Half of 325.7 is actually ~163, which might round to 170.

If I apply the formula above to the observed trade data, and also identify those who are in and out of gravity, I also get results that are never off by more than $10K in either direction. Seven prices are off on Nov. 1, and 17 are off on Nov. 2. Interestingly, most of the values that are off fall very close to the middle of the rounding range, suggesting that my rounding algorithm might be a source of tracking error. Perhaps rounding actually occurs in stages (for example, if the buy and sell percentages are only carried to a limited number of decimal places).

Why these (seemingly arcane) factors?

First, as has been shown above, there is more than one way to express this formula. Maybe in a different (but equivalent) representation the factors work out to round numbers. Regardless, my guess is that TSN first tried to develop an approach to get the dampening effect at the extremes. A little trial and error could have done that. (In the form that Sludge presents, maybe the actual underlying factor is 16.)

Then, they would have scaled the multiplier (what I have typically referred to as the "sensitivity factor") up or down to produce the desired magnitude. From year to year, this sensitivity factor has seemed to change, but it generally remains constant throughout a season. But in at least one year, there was an abrupt observable change after about one week of price changes, when initial roster values appeared to be inflating too fast. TSN (or perhaps Small World) never fessed up to the change, but the evidence was compelling.
 
43darkside
      Dude
      ID: 3590317
      Wed, Nov 05, 2003, 14:34
I will never forgive Ramon for that.
 
44Guru
      ID: 330592710
      Wed, Nov 05, 2003, 14:34
BTW, if I change the factors to 340 and 16 (i.e., nice round numbers), the results are almost as tight. There are still no errors greater than + or - 10K, and there are only 35 of these minor tracking errors for the two days, rather than 24.
 
45Species
      Leader
      ID: 569221717
      Wed, Nov 05, 2003, 14:42
lol Darkside....and don't forget Jaret Wright, if I'm not mistaken (as I viciously try to shake the cobwebs from my memory)
 
46Smartone @work
      ID: 22815228
      Wed, Nov 05, 2003, 14:43
great job, Sludge and Guru!

now, as the baseball season is over, will someone in TSN be brave enough to stand up and admit that they forgot to "LOG" the pricing function in last season's unforgettable Griffey Jr. price drop...
 
47darkside
      Dude
      ID: 3590317
      Wed, Nov 05, 2003, 14:52
It's tough to clear away the cobwebs, Species...seems like ages ago.

Thanks a lot, Guru and Sludge!
 
48Guru
      ID: 330592710
      Wed, Nov 05, 2003, 15:06
What's it all mean?

There are several potentially useful findings:

1. The max daily gain or loss is probably $170K. If we ever see something slightly larger, then we might need to recalibrate the formula. If we see something much larger, then it is likely that TSN changed the formula.

2. To produce a maximum gain, a player would need to have about one-third (or more) of all buys for the day (and no sells). To produce a max loss, a player would need to have one-third (or more) of all sells for the day (and no buys).

Here's a quick table that maps the percentages into a corresponding price change (assuming trades are either all buys or all sells):

  %    Gain (or Loss), $K
100    170
 33    170
 25    160
 20    150
 15    140
 10    110
  8     90
  6     70
  4     50
  2     30
  1     10


(Note: the point at which the gain or loss maxes out at $170 is highly sensitive to small changes in the formula, so the peak of 170 might require a much higher concentration. However, the rest of the table is quite stable for small changes in the factors.)
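The table can be reproduced with a short script. One caveat: as far as I can tell, the values match the rounded factors 340 and 16 mentioned in [44] rather than the fitted 325.7 and 16.02 (which would top out at 160), so that is what the sketch below uses. That reading is an inference from the numbers, not something stated in the post. Percentages are converted to fractions, and rounding to the nearest 10 is assumed.

```python
import math


def table_change(share, scale=340.0, rate=16.0):
    """Rounded gain in $K for a player with `share` of all buys and no sells.

    `share` is a fraction (0.10 for 10%). The factors 340 and 16 are assumed
    here because they appear to reproduce the table.
    """
    x = math.log(share + 1.0)  # sells are zero, so log(%Sells + 1) = 0
    raw = scale * (0.5 - 1.0 / (1.0 + math.exp(rate * x)))
    return int(round(raw / 10.0)) * 10


for pct in (100, 33, 25, 20, 15, 10, 8, 6, 4, 2, 1):
    print(pct, table_change(pct / 100.0))
```

By symmetry, the same share of all sells (with no buys) gives the same number as a loss.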
 
49Sludge
      Sustainer
      ID: 25919714
      Wed, Nov 05, 2003, 15:26
Another function that has the same behavior is arctan. That and the logistic function would have been my first choices if I were seeking to dampen the extreme price changes without simultaneously dampening the moderate price changes. Looking at fitting that instead of the logistic curve, it's virtually identical except it would allow for a maximum loss/gain of right at 200.

Price Change = 143.51*arctan(9.19*x), where x is as above.

Note that the extreme is when every buy is for one player and he has no sells. That makes the "x" for that player log(1+1) - log(1) = 0.693. Plug that into the price change equation, and you get 203. 200K is an awfully nice round number, don't ya think? As Dana Carvey once said (many times) while dressed in drag, "How convenient."
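A quick check of that arithmetic, as a sketch: the function name is made up, shares are taken as fractions between 0 and 1, and log is the natural log.

```python
import math


def arctan_change(buy_share, sell_share, scale=143.51, rate=9.19):
    """Unrounded price change in $K from the fitted arctan curve."""
    x = math.log(buy_share + 1.0) - math.log(sell_share + 1.0)
    return scale * math.atan(rate * x)


# Extreme case: one player gets every buy and has no sells, so x = ln(2).
print(round(arctan_change(1.0, 0.0)))
# Mirror case: one player gets every sell and has no buys.
print(round(arctan_change(0.0, 1.0)))
```

Because x can never exceed log(2), the curve stops at about 203 even though arctan itself would keep creeping toward 143.51 * pi/2 (about 225) for larger inputs.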
 
50Sludge
      Sustainer
      ID: 25919714
      Wed, Nov 05, 2003, 15:44
More picture goodness:



This shows the curves projected to the absolute extremes as described above. I wouldn't be doing my job if I didn't point out that this is extrapolation, and extrapolation is quite often pure crap. I should also be taking points off of myself for not properly labeling my graphs. I'll fix that.
 
51RecycledSpinalFluid
      Dude
      ID: 127421915
      Wed, Nov 05, 2003, 15:48
LOL: "extrapolation is quite often pure crap"
 
52Guru
      ID: 330592710
      Wed, Nov 05, 2003, 16:03
Would that be better termed "excrapolation"?
 
53Blooki@Work
      ID: 24920611
      Wed, Nov 05, 2003, 16:36
Guru,

Grant Hill (02-03)

There's one for you. I'm not sure what to conclude from it. I'm sure those days where Hill defied gravity are not due to trading activity regarding Hill. This would lead me to believe that trading activity or ownership of players other than Grant Hill would somehow affect the "in/out of gravity" calculation for Hill. I don't have any more significant data on this, but I would appreciate it if those who are running tests could be on the lookout for anomalies such as this one.

I remember mentioning this last year, but at the time nobody seemed interested in cracking intricacies of the pricing algorithm and I'm glad this discussion has piqued everyone's interest now.
 
54Guru
      ID: 330592710
      Wed, Nov 05, 2003, 16:44
Yeah, I think there were several last year who seemed to shift in and out of gravity for no obvious reason, although I don't think anyone ever tried to run ownership samples on those players to be able to assess the underlying data.
 
55Astade
      ID: 214361313
      Thu, Nov 06, 2003, 23:55
Hey Guys,

I was hoping to investigate the price change formula myself, but this last week I've been busy with work.

My question to the general audience who have been following this (especially Sludge), weren't there 2 different formulas for price changes? One for Gainers and one for Losers?

Don't get me wrong, you have done a great job fitting the current data available, but from all the background research/releases (from TSN), wasn't it decided that %Sells would have a different weight than %Buys?...Yet, Sludge you seem to combine the two into one universal equation?

If I'm completely off-base, please correct me quickly, so I can self-edit (and delete my post) ;)
 
56Sludge
      Sustainer
      ID: 24981818
      Fri, Nov 07, 2003, 00:52
Astade -

Ummmm, maybe.

The data don't seem to support that hypothesis, however. It might explain the "shoelaces" as Guru has called them, but I doubt it. Even if it does, how much will it kill you to be off by at most 10K? (With the usual caveat that we haven't checked the model on the extreme price changes yet.)

I will try it with two different scale parameters for the two tomorrow, though, and see if that improves the fit.

I don't haunt the hoops forum, but I thought that the reason that people assign a different weight to buys versus sells is because there will always be more buys than sells, so that when they are scaled by dividing by the total buys and sells (respectively), each buy has a weight of 1/Total Buys and each sell has a weight of 1/Total Sells (paradoxically, this actually removes the inordinate weighting that would be assigned if we were to just use the raw number of buys/sells). When building the model, I operated under the (possibly mistaken) assumption that it would be silly to say that a player that gets X% of the total buys and X% of the total sells would move in price at all. In that case, either of the models above would predict a price change of 0. If each variable is assigned a different weight, then this won't be the case and, as you can see from the data, isn't the case.
 
57Sludge
      Sustainer
      ID: 24981818
      Fri, Nov 07, 2003, 00:55
Err... umm... that last bit should read, "...and, as you can see from the data, is the case."
 
58Tortfeasor
      Sustainer
      ID: 458192922
      Fri, Nov 07, 2003, 01:14
Blooki-

I'm no mathematician/statistician, but your Grant Hill example makes me wonder: perhaps there are certain minimum values accorded to a player? I know the minimum is supposedly 500K, but why couldn't they set the bar higher for some players?

Just a thought, may be way off base here.
 
59Guru
      ID: 330592710
      Fri, Nov 07, 2003, 08:47
Astade - previous research showed that the weighting of a buy was different than the weighting of a sell. But that is because there are a different number of total buys than total sells. 100 buys might represent 5% of the total buys, while 100 sells might simultaneously represent 6% of the total sells. By first converting buys and sells to percentages, the need for different weightings has been taken into consideration.

In past incarnations, we sometimes factored down the buys by a constant percentage so that total adjusted buys were equal to total sells. Then we subtracted sells from factored buys. Ultimately, we also multiplied by a sensitivity factor that had the total number of sells in the denominator.

If you rearrange the terms, that is exactly what Sludge's formula does as well. So this is not inconsistent with the previous findings.
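
For reference, the rearrangement can be written out as follows. The symbols are labels chosen for this note rather than anything from TSN: b and s are a player's buys and sells for the day, B and S are the day's total buys and total sells, and k is an unspecified sensitivity constant. This covers only the linear version of the idea, before any log or square-root transform is applied to the shares.

```latex
\[
\underbrace{\left(b \cdot \frac{S}{B} - s\right)}_{\text{factored buys minus sells}}
\cdot \frac{k}{S}
\;=\;
k\left(\frac{b}{B} - \frac{s}{S}\right)
\]
```

The left side is the older approach (scale buys by S/B, subtract sells, divide by total sells); the right side is the difference of the two percentages times the same constant.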
 
60Blooki@Work
      ID: 24920611
      Fri, Nov 07, 2003, 09:50
Tortfeasor,

That would be a possible explanation for why Hill stops gravitating at $4,200,000, but if you look earlier, he occasionally stops and resumes gravitating at other points in the season for absolutely no good reason. I find it extremely unlikely that he's being traded at these times.
 
61Tortfeasor
      Sustainer
      ID: 37948287
      Fri, Nov 07, 2003, 14:24
Blooki-

Right. But it could be that some teams that are inactive had him, checked them once, made one or two sells, thus taking him out of gravity. If I remember correctly, gravity applies only when a player is not being bought or sold at all. So if you sold him, that could take him out of gravity, right? I never understood whether sells could take a player out of gravity or not. Just a guess.
 
62Guru
      ID: 330592710
      Fri, Nov 07, 2003, 14:52
The recent data does indicate that several buys can take a player out of gravity - although a single buy doesn't seem to be enough. I haven't found the empirical evidence to suggest that sells do or don't have a similar impact.
 
63Deadeyes
      ID: 50104029
      Thu, Nov 13, 2003, 19:00
Interesting stuff you got going on here. If you give me the data I can probably solve it using MATLAB.
 
64 Pacers Rule
      ID: 4095190
      Thu, Nov 13, 2003, 19:08
Deadeyes, welcome to the forum of the big boys. You have to behave over here. If you don't, just find a post by Jerry Lewis to see what will happen to you...
 
65Perm Dude
      Dude
      ID: 30792616
      Thu, Nov 13, 2003, 19:11
hola, deadeyes. Was hoping you'd make it to the forum.

pd
 
66Sludge
      Leader
      ID: 24981818
      Thu, Nov 13, 2003, 23:08
It's not solved already?

Hmph. I thought I did a pretty damn good job.

The data's linked in #12. There's only two days' worth there. I'm still waiting for RSF to automate the whole thing. :)
 
67Sludge
      Leader
      ID: 24981818
      Thu, Nov 13, 2003, 23:25
By the way, with more data (I don't think the data set we have now is rich enough, which is why I really didn't spend much time on it), the way I would approach solving the gravity problem is using discriminant analysis. For those that don't know what it is, I'll just quote MINITAB's help file:


Use discriminant analysis to classify observations into two or more groups if you have a sample with known groups. Discriminant analysis can also be used to investigate how variables contribute to group separation.


Seems like the right hammer for this nail to me. So basically, all we would have to do is code each player for each day as subject to gravity or not (there will surely be some errors inducing a bit of random variation, but that's not a problem), and my first guess as discriminants would be indicator functions of raw buys, raw sells, and ownership. (My second guess would be percentages of the above.)

Another option would be logistic regression since the dependent variable is dichotomous.

A nice graphical technique would be to plot Buys vs. Sells vs. Ownership (a three dimensional plot) coloring the gravity points red and the rest black. Figure out what surface separates the points, and you have your rule. My guess is that it might be something as simple as a plane.
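
A rough sketch of the logistic regression option, in plain Python with no stats package. Everything about the data here is invented: the grid of player-days, the labelling rule (ownership % plus 0.2 times buys below 2.0), and the feature scaling are assumptions chosen only so there is a simple boundary to recover. Real coded gravity data would replace the `rows` list.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Made-up player-days: ownership % from 0.5 to 4.0, buys from 0 to 10.
# The labelling rule is hypothetical, not a known TSN gravity rule.
rows = []
for i in range(1, 9):
    ownership_pct = 0.5 * i
    for buys in range(11):
        in_gravity = 1 if ownership_pct + 0.2 * buys < 2.0 else 0
        # Scale both features to roughly [0, 1] to keep gradient descent stable.
        rows.append((ownership_pct / 4.0, buys / 10.0, in_gravity))

# Fit by full-batch gradient descent on the mean log-loss.
w_own = w_buys = bias = 0.0
learning_rate = 1.0
n = len(rows)
for _ in range(10000):
    g_own = g_buys = g_bias = 0.0
    for x_own, x_buys, y in rows:
        err = sigmoid(w_own * x_own + w_buys * x_buys + bias) - y
        g_own += err * x_own
        g_buys += err * x_buys
        g_bias += err
    w_own -= learning_rate * g_own / n
    w_buys -= learning_rate * g_buys / n
    bias -= learning_rate * g_bias / n

# Share of player-days whose predicted class matches the coded label.
hits = sum(
    1 for x_own, x_buys, y in rows
    if (sigmoid(w_own * x_own + w_buys * x_buys + bias) >= 0.5) == (y == 1)
)
accuracy = hits / n
print(w_own, w_buys, bias, accuracy)
```

The fitted line w_own*x_own + w_buys*x_buys + bias = 0 is the two-variable version of the separating plane described above; adding sells as a third feature would make it a plane in the three-dimensional plot.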
 
68RecycledSpinalFluid
      Dude
      ID: 204401122
      Thu, Nov 13, 2003, 23:27
Sludge, the DB is 50+ MB uncompressed, but I will make the share I showed you available again (now that I've recovered from router and server issues).
 
69Sludge
      Leader
      ID: 24981818
      Thu, Nov 13, 2003, 23:34
Oh man, you want me to go through all of that again? :) That code I sent you only goes so far (just listing players bought and players sold... it doesn't even aggregate them yet, although that's a 1 minute coding job to get it to do that since the procedures are already built into R). The real pain is matching up the price changes with the players bought and sold.

I'll do it if, uhhh, 5 people donate 5 bucks each to Guru. Yeah, that's the ticket.
 
70RecycledSpinalFluid
      Dude
      ID: 204401122
      Thu, Nov 13, 2003, 23:38
I can do something about the price change thing that should make it easier for you. Would you mind sending me the URL for the price change thing? (I used to have it, but now it is lost in the maze of links on my network.)
 
71Sludge
      Leader
      ID: 24981818
      Thu, Nov 13, 2003, 23:39
Guru can probably beat me to it, as I have it in my Outlook at the office and can't get to it until tomorrow morning. But if he doesn't, I'll send it over to you.
 
72Deadeyes
      ID: 50104029
      Thu, Nov 13, 2003, 23:59
Just got back from the bars. Who is Jerry Lewis? What happened to him? I have a short fuse. Not sure if I can last with the big boyz. But I can probably try to solve the equation for the price movers when I have time. Good night. Is this more popular than the chat rooms in the other forum on TSN?
 
73Guru
      ID: 330592710
      Fri, Nov 14, 2003, 08:49
TSN Ultimate Hoops daily prices. (Let me know if you need the codes to match player IDs into names.)
 
74Guru
      ID: 330592710
      Fri, Nov 14, 2003, 08:54
This file has the player IDs in it. Columns 4-7 are the ID#, and the name follows shortly thereafter.
 
75Deadeyes
      ID: 198319
      Fri, Nov 14, 2003, 09:28
Does it have anything to do with points scored by the player, or just ownership buy/sell? Do we know this? From the graph it doesn't seem to be just by % ownership. Looks like another variable is involved.
 
76Guru
      ID: 330592710
      Fri, Nov 14, 2003, 09:37
Points scored are not a factor.
 
77Sludge
      Leader
      ID: 511027148
      Fri, Nov 14, 2003, 09:38
Entirely possible, deadeyes, but I would guess not. If there were another variable that is used to compute price changes, it would have to have a very small impact. The two models above are off about 10% of the time by 10K in either direction (and nothing more than that in the sample that I have), so that means that any other variables included would have a very low impact on the price change and, to my way of thinking, why would they bother including them then? (All of this begs the question, what's the correlation between points scored and ln(%Buys+1)-ln(%Sells+1)?)

My guess, though, is that we've got the variables, but we might not have the exact form that they are used. Why did I use ln(%Buys+1)? Why not log base 10? Why not ln(%Buys+C), where C is something other than 1?
 
78Guru
      ID: 330592710
      Fri, Nov 14, 2003, 09:39
...and I think the graph looks pretty tight. Why do you think the graph shows evidence of another factor?
 
79Sludge
      Leader
      ID: 511027148
      Fri, Nov 14, 2003, 09:45
Actually, it doesn't matter which base you use for the log:

ln-base-N(x) = ln(x)/ln(N)
 
80Sludge
      Leader
      ID: 511027148
      Fri, Nov 14, 2003, 09:47
Oops... that should be

log-base-N(x) = ln(x)/ln(N)

So used to just typing "ln" when I'm writing about logarithms.
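
A quick numerical check of the change-of-base point. The 5% and 3% shares are arbitrary example inputs, not data from the thread.

```python
import math

def log_diff(pct_buys, pct_sells, base):
    """log_base(%Buys + 1) - log_base(%Sells + 1)."""
    return math.log(pct_buys + 1.0, base) - math.log(pct_sells + 1.0, base)

# Arbitrary example shares, in percent.
natural = log_diff(5.0, 3.0, math.e)
base10 = log_diff(5.0, 3.0, 10.0)

# The two differ only by the constant factor ln(10), whatever the inputs.
ratio = natural / base10
print(ratio, math.log(10.0))
```

Since the ratio is the same constant for every player, switching bases only rescales the fitted sensitivity factor and leaves the quality of the fit unchanged. The constant C in ln(%Buys+C) is a different matter, because it changes the shape of the curve rather than its scale.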
 
81Smartone @work
      ID: 22815228
      Fri, Nov 14, 2003, 13:10
A little OT, but I was wondering if you could help me, as I am interested in building a "next-generation" simulation tool based on historical data. In order to do so I need to extract information from the web. Unfortunately, I am using Office 97 (on Windows XP) and I couldn't, so far, understand how "web queries" work. Can anyone give me a little help here? Unfortunately, the only "programming language" I know is Excel and a little Access.

My goal is to have an input database of all of last year's (perhaps this year too) games and the TSNP of each player in these games (Minutes Played can also be useful).
 
82RecycledSpinalFluid
      Dude
      ID: 510441411
      Fri, Nov 14, 2003, 13:16
Smartone, actually, if you are using Office 97, you don't need to use web queries, just open the URL as you would a file.
 
83Guru
      ID: 330592710
      Fri, Nov 14, 2003, 13:34
smartone, not quite sure what data detail you need, but I have an Excel spreadsheet that has minutes played for each player for each game, as well as TSNP for each game. I could provide these for several years as well. Shoot me an email if you're interested.
 
84 Smartone @work
      ID: 22815228
      Fri, Nov 14, 2003, 13:55
thanks a lot, Guru!

I think that 2002/03 data as well as this year's will be great
 
85Deadeyes
      ID: 50104029
      Sun, Nov 16, 2003, 20:06
Hey, when you used Excel, did you use the rounding up or down function? I wonder if the points would match up if you did. I believe this is the function you need to use:

ROUND(number,num_digits)

I believe the number can be the equation. Try it and see if the points match, and let me know.
 
86Guru
      ID: 330592710
      Sun, Nov 16, 2003, 20:16
Yes, I used the Excel rounding function. I could not get an exact match, but I suspect the rounding approach is the culprit, as most of the tracking errors were on values which were close to the middle of each rounding range.

The TSN program may round at several intermediate steps. Or perhaps some of the underlying factors are just slightly off. Or both.
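
A small sketch of why intermediate rounding could produce exactly this kind of miss. It assumes prices move in $10,000 steps (consistent with the 10K errors mentioned in this thread) and invents a hypothetical intermediate rounding to the nearest $1,000; whether TSN does anything like that is unknown. The helper mimics Excel's ROUND, which sends ties away from zero, rather than Python's built-in round(), which sends ties to the nearest even value.

```python
from decimal import Decimal, ROUND_HALF_UP

def excel_round(value, step):
    """Round to the nearest multiple of step, ties away from zero (like Excel's ROUND)."""
    units = (Decimal(str(value)) / Decimal(str(step))).quantize(
        Decimal("1"), rounding=ROUND_HALF_UP
    )
    return int(units * Decimal(str(step)))

# Hypothetical unrounded price change, just under the middle of a 10K range.
raw_change = 44600

# Rounded once, straight to the nearest 10K.
one_step = excel_round(raw_change, 10000)

# Rounded to the nearest 1K first (assumed intermediate step), then to 10K.
two_step = excel_round(excel_round(raw_change, 1000), 10000)

print(one_step, two_step)  # 40000 50000
```

The two results differ by a full 10K even though the underlying value is identical, and this only happens for values near the midpoint of a rounding range.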
 
87 Pacers Rule
      ID: 4095190
      Sun, Nov 16, 2003, 23:24
Something is rotten in Denmark. Looking at a comparison between change in ownership and price changes the other evening (with the naked eye, no stats program needed), I found that the proportion of player sells to total sells cannot be the only factor in price changes. I saw that a player who had a delta of around 600 didn't really change in price that much more than a player with a delta of 100. Seems like with a 6:1 ratio of sells, there would have been a bigger difference. Sorry I don't have the exact examples. I didn't write them down, but thought that later I might remark about it in case anyone cared to check into it. I wondered if it might also have something to do with total buys/sells in relation to total ownership, as the 100 delta was on a much less owned player than the 600 or so. Each delta represented a similar percentage respective to each player, if that makes any sense. Like each player had 20% of its owners selling, for example.
 
88Guru
      ID: 330592710
      Sun, Nov 16, 2003, 23:31
You're going to need to point out that example. I haven't seen anything like that.
 
89penngray
      Sustainer
      ID: 423241723
      Mon, Nov 17, 2003, 11:49
smartone, Re #81: if you are considering doing any serious web-based parsing, look into learning VBScript and/or, even better, Python or PHP (web development tools). Handling of web page source is much more robust once you learn any of these. Plus, I really think once you learn them you will be able to generate your own very functional web pages for simulations.

Just ask RSF how much simpler things are with scripts ;)


 
90Sore Thumb
      ID: 571049813
      Wed, Nov 19, 2003, 15:15
This is off the topic a little, but I was wondering if there is any way of determining how many different people use the hoops forum. I would like to see the ratio of guru users compared to the 5,000 people in the TSN game. Kind of see how big of an influence these boards might have.
There is no way of knowing how many view but don't post, but a roundabout figure could be interesting.

Thanks Guru.
 
91Guru
      ID: 330592710
      Wed, Nov 19, 2003, 15:46
There is no accurate way that I can monitor this, because a lot of people come from multiple points of entry (home, work, school, library, etc.)

There is also no easy way to know how many distinct managers are represented in the 5000+ teams - although perhaps this is something that RSF could track in his sampling, by keeping track of manager name (or number) in addition to team number.

It would be interesting, though. I'll set up a separate thread to start an informal survey.
 
92smartone
      Donor
      ID: 29452720
      Sat, Nov 22, 2003, 10:59
can anyone check why Chris Bosh continues to gravitate(?) while his ownership level is over 2% and people continue to buy him?
 
93Guru
      ID: 330592710
      Sat, Nov 22, 2003, 11:28
Just glanced at Bosh's ownership, and he does look like an anomaly. His ownership on Nov. 20 was 140 teams, more than 2.6% of rosters, which seems to be well above the gravity threshold that was in place earlier in the month. He also had 11 buys on 11/20, which should have kicked him out.

It might be time to get some complete details and rerun the numbers to see whether there is a plausible explanation that is consistent with the prior data.
 
94RecycledSpinalFluid
      Dude
      ID: 510441411
      Sat, Nov 22, 2003, 15:41
The gravity cutoff looks like it's at about 3%, which would include Bosh. Haven't looked at past numbers, but 2.8-2.9ish seems to be a number that is sticking in my head from previous looks.
 
95Guru
      ID: 330592710
      Sat, Nov 22, 2003, 16:44
RSF - based on the earlier analysis in [26] and [28], the gravity cutoff was around 2%.
 
96JCS
      Sustainer
      ID: 20102934
      Fri, Dec 05, 2003, 11:50
Hmmm.. let's look at the latest ownership numbers. Murray gets a -344 delta and loses 80k, and Billups gets a -349 delta yet loses 100k. This may be evidence that the player's price is a factor in his price change.
 
97Guru
      ID: 330592710
      Fri, Dec 05, 2003, 13:11
It may also be a consequence of differences in the buy/sell distribution of those net changes.

For example, Billups may have had 349 sells and no buys, while Murray may have had 364 sells and 20 buys. That might cause a slight difference in the formula change. Rounding could also account for part of the difference.
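
To illustrate, here is the supposed Billups/Murray split run through the log-difference form from [77]. The buy/sell splits are the hypothetical ones from the paragraph above; the daily totals of 3,000 buys and 3,000 sells and the sensitivity k = 1 are made up for this sketch, so only the ordering of the two results means anything, not their size.

```python
import math

def predicted_change(buys, sells, total_buys, total_sells, k=1.0):
    """k * (ln(%Buys + 1) - ln(%Sells + 1)), shares in percent; k is a placeholder."""
    pct_buys = 100.0 * buys / total_buys
    pct_sells = 100.0 * sells / total_sells
    return k * (math.log(pct_buys + 1.0) - math.log(pct_sells + 1.0))

# Made-up daily totals; real totals would come from the sampled trade data.
TOTAL_BUYS, TOTAL_SELLS = 3000, 3000

# Net deltas of -349 and -344, built from different buy/sell mixes.
billups = predicted_change(0, 349, TOTAL_BUYS, TOTAL_SELLS)
murray = predicted_change(20, 364, TOTAL_BUYS, TOTAL_SELLS)

print(billups, murray)  # Billups comes out more negative than Murray
```

Under these assumptions the nearly identical net deltas give noticeably different predicted changes, with Billups falling further, which matches the direction of the 100k versus 80k drops without needing price as an extra factor.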
 
98Pacers Rule
      ID: 910311210
      Fri, Dec 05, 2003, 15:00
RE #87
I haven't confirmed this for a fact, but the thought crossed my mind that I could have been looking at the current price moves while looking at the previous day's ownership numbers, which would explain why there wasn't a match! I probably forgot that the ownership numbers don't come out until mid-day the next day.