RotoGuru Baseball Forum

0 Subject: Pitching Statistics, Part III

Posted by: Madman
- [29246911] Sun, Mar 11, 21:07

In Pitching Statistics Part II, I gave some tables showing that pitchers who have had below-average or above-average starts in the recent past tend to reverse their trend.

In this thread, I'll analyze which pitchers tended to fit this pattern, and which did not. What you'll find, as you would expect after reading Pitching Stats II, is that it is usual for a pitcher to do better after 3 bad starts and vice versa. Further, however, you'll see how this may affect some pitchers more than others.

With only 30-35 starts, it's always possible that the results for any one of these starters are due to random chance. For each of the pitchers shaded with brighter colors (green or purple), you are 95% sure that there is a non-zero effect for that pitcher. Of course, with this many pitchers being listed, you'd expect 1 in 20 to be significant with just random chance.

In the table below, I report the coefficient and standard error from a least-squares regression run individually for each pitcher. The regression was (next game SWP) = b*(average of last 3 SWP) + constant.

"b" is listed in the second column. The standard error for "b" is listed in the third column.
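
For anyone who wants to try the same kind of fit, here is a minimal sketch of the regression described above. It is not the VBA program behind the table; the function names `lagged_pairs` and `ols` and the sample season are made up for illustration, and the standard error is the ordinary textbook least-squares one.

```python
from math import sqrt

def lagged_pairs(swp):
    """Return (average of previous 3 starts, next start) for every run of 4 starts."""
    return [(sum(swp[i:i + 3]) / 3.0, swp[i + 3]) for i in range(len(swp) - 3)]

def ols(xs, ys):
    """Least-squares fit of y = constant + b*x; returns (b, constant, std. error of b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    constant = mean_y - b * mean_x
    # Residual variance uses n - 2 degrees of freedom (two estimated parameters).
    ssr = sum((y - (constant + b * x)) ** 2 for x, y in zip(xs, ys))
    se_b = sqrt(ssr / (n - 2) / sxx)
    return b, constant, se_b

# Hypothetical SWP values for one pitcher, invented purely for illustration.
season = [60, 10, 85, 40, 95, 5, 70, 55, 20, 90]
pairs = lagged_pairs(season)
b, constant, se_b = ols([x for x, _ in pairs], [y for _, y in pairs])
print(f"b = {b:.2f}, constant = {constant:.1f}, std. error = {se_b:.2f}")
```

A 10-start season gives only 7 pairs, so a real run would want the full 30+ starts per pitcher.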

You can interpret this as follows:

* For each additional point in Al Leiter's average over his last 3 starts, you'd expect his next start to be reduced by 0.88 points. And you are 95% sure that his next start will be reduced by some amount greater than zero.

Hypothesis

* if you believe that pitchers are "streaky", then you would expect each pitcher to be listed in green.

* if you believe that pitching statistics are "counter-cyclical" -- i.e., bad performances follow good and vice-versa, you should expect a lot of purple.

So, to the data!

Name Coefficient Std. Error
ALeiter -.88 0.39
Ankiel .04 0.46
Appier -.73 0.39
Arrojo .28 0.34
Ashby .17 0.30
Astacio -.38 0.46
BAnderson -.32 0.36
BRadke -.35 0.39
Burba -.30 0.35
Clemens .02 0.36
Clement -.72 0.32
Colon -.24 0.43
CPark .04 0.34
Dempster -.01 0.39
Dreifort .11 0.34
DWells -.74 0.33
Elarton .15 0.33
EMilton -.08 0.29
Finley -.05 0.31
GaStephenson -.11 0.35
GHeredia -.09 0.34
Glavine .21 0.27
GMaddux -.29 0.33
Halama -.25 0.37
Helling .29 0.28
Hentgen -.55 0.38
Hermanson -.42 0.34
Holt -.32 0.31
Hudson .04 0.31
JBere -.58 0.36
JHaynes -.20 0.36
JSanchez -.47 0.41
JWeaver -1.08 0.43
KBenson .06 0.37
KBrown -.28 0.35
Kile -.41 0.31
LHernandez -.62 0.41
Lieber -.09 0.38
Lima -.05 0.31
Loaiza -.41 0.32
Meadows -.22 0.35
MHampton -.14 0.31
Millwood -.14 0.31
Mussina -.77 0.36
Neagle .04 0.29
Nomo -.45 0.33
Parque -.38 0.34
Parris .20 0.35
Pettitte -.94 0.41
Ponson -.08 0.35
Rapp -.45 0.39
Reynoso -.95 0.45
Ritchie -.44 0.38
RJohnson .23 0.32
Rogers -.23 0.32
RReed -.13 0.32
Rueter -.26 0.37
RuOrtiz .39 0.28
Rusch -1.08 0.39
Sele .09 0.28
SEstes -1.11 0.36
Sirotka -.12 0.40
STrachsel -.27 0.29
Suppan -.50 0.39
Tapani -.03 0.44
Vazquez -.14 0.31
Wolf .20 0.32
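
If you are reading the table without the color shading, a rough way to redo the 95% check is to divide each coefficient by its standard error and compare the ratio to a cutoff. The sketch below does that for a handful of rows copied from the table. The cutoff of 2.05 is an assumption (about right for 25-30 degrees of freedom), so the labels may not match the original shading exactly.

```python
# A handful of rows copied from the table above: (name, coefficient b, std. error).
rows = [
    ("ALeiter", -0.88, 0.39),
    ("Ankiel", 0.04, 0.46),
    ("Appier", -0.73, 0.39),
    ("Clement", -0.72, 0.32),
    ("Glavine", 0.21, 0.27),
    ("JWeaver", -1.08, 0.43),
    ("RuOrtiz", 0.39, 0.28),
    ("SEstes", -1.11, 0.36),
]

# Assumed two-sided 95% cutoff for roughly 25-30 degrees of freedom.
T_CUTOFF = 2.05

def classify(b, se, cutoff=T_CUTOFF):
    """Label a coefficient by comparing its t-ratio (b / se) to the cutoff."""
    t = b / se
    if t <= -cutoff:
        return "counter-cyclical"
    if t >= cutoff:
        return "streaky"
    return "not significant"

labels = {name: classify(b, se) for name, b, se in rows}
for name, b, se in rows:
    print(f"{name:10s} b={b:+.2f}  t={b / se:+.2f}  {labels[name]}")
```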


Conclusions

Notice the total absence of bright green from the above chart. This means that, counter to what would be expected from random chance, no pitcher's data show streak behavior strong enough to be 95% certain that it really exists. In fact, no one is even close.

Secondly, notice that there aren't very many bright purple. In fact, I would say that this data doesn't statistically prove a counter-cyclical trend. However, I believe it is highly suggestive. The fact that the vast majority of pitchers are counter-cyclical in their point estimate is rather striking.

The players in bright purple I might keep an extra eye on. They might be a statistical fluke (remember, when you run this many regressions, you'd expect some to come out significant). Nonetheless, your best guess is that the bright-purple pitchers tend to be better able to recover after bad outings, or are less likely to keep a hot streak going.

The next step would be to run a fixed-effects model on the data to try to peg down this counter-cyclical issue once and for all . . . But what is above is yet another piece of evidence that is consistent with that hypothesis, and contrary evidence for those who still believe that pitchers are streaky.
1 HooeyPooey
      ID: 362451122
      Sun, Mar 11, 23:01
I'm afraid that one of these days you (or Sludge or someone) will work out the 'optimal' SW baseball strategy. (Not really, but you never know.) It does remind me though, of last year when someone suggested putting together a computer program to run a SW team. Throw a few of these analyses in and do some on batters along with SW pricing, and you could probably do it. :)
2 Madman
      ID: 29246911
      Sun, Mar 11, 23:34
Well, there's the 'Optimal SW baseball strategy' after the fact. A lot of people COULD calculate that, but it would take a lot of work, and wouldn't tell us much, I'm afraid (at least relative to the work involved).

A lot of people could design a computer AI to make decisions based on a variety of criteria. Although it's possible that a computer could do better than a human, I think it would be exceptionally hard. Akin to computer chess. Even if you could design a computer brain that would win, it would take a ton of resources to develop.

Personally, I'm of the philosophy that the best thing to do is crank all the numbers you can, think about the evidence in as many ways as possible, let it seep into your subconscious.

And then go with your gut.

In other words, I do all this basically to make sure I feed my subconscious personal computer the best information possible.
3 Perm Dude
      ID: 28059111
      Mon, Mar 12, 00:08
Madman, maybe this is in the other thread, but how did you choose these pitchers? Two pitchers who had awful losing streaks last year (Cone and Lima) are not listed.

pd
4 Tim G
      ID: 1611393123
      Mon, Mar 12, 00:12
Fascinating stuff Madman but I have a few
questions.

First, for a given pitcher, did you take 4
consecutive starts to generate an x,y data
point, where x = "average of last three SWP"
and y = "next game SWP"? For example the
first x,y point would be start 1,2, and 3 for x and
start 4 for y. Then step up one start for another
x,y point. I think this is obvious, but I wanted to
clarify.

Second, take Leiter for example, I don't see
how with b = -0.88 he could do anything but
decrease his next game SWP regardless of
his last three, unless the "constant" is not a
constant throughout the season. What do the
constants for the various pitchers look like?

Third, Ankiel has a very small slope (b = 0.04),
does this mean his next SWP always looks
pretty much like his last three?

Another thing you might look at if you run out of
ideas (which I doubt) is to look at the SWP
distribution of these 30 - 35 start pitchers. You
can fit them to some equation (Gaussian for
example) to see who is prone to large swings
in SWP, either positive or negative. Richard
did something like this last year.

Sorry all for the narrow lines in my post, a
technological problem.
5 Tim G
      ID: 1611393123
      Mon, Mar 12, 00:18
Let me change my Ankiel comment. With a slope near zero, his next SWP would be very small. Again what are the constants like?
6 HooeyPooey
      ID: 362451122
      Mon, Mar 12, 00:33
I don't think a truly 'optimal' realtime strategy is possible, but I use optimal in the sense of providing an edge. It's hard to say when a computer actually makes decisions, since ultimately it is human knowledge that allows the computer to decide. With a game as complex as this, developing a good computer AI would be difficult. I know from experience AI programming is some of the hardest because you have to break each minute thought process down, which we take for granted and do so fast in our heads. However, once the correct AI is in place, the computer excels in what it does best, and that is fast calculations.
7 Madman
      ID: 29246911
      Mon, Mar 12, 06:24
PD I just did this for pitchers with 30+ starts. I could easily change this restriction with my VBA program.

Tim G Lots of good points.

a) I took all 4-start observations. x = average(1,2,3). Y = 4. Exactly as you described.

b) Each pitcher has a different constant term. Better pitchers generally have a larger constant. I can report this if you want. Just didn't think about it, since the hypothesis I was specifically interested in was the slope coefficient.

c) My answer in b) actually addresses your concern about Leiter. All the -0.88 means is that the 3-start average and the next start are negatively related. Large 3-start averages are followed by small 4th starts and vice versa. It's the vice versa part that you're wondering about.

Consider Leiter's first 3 sets of 4 starts (3-start average, next start):
68.7, -18
19.0, 32
7.0, 117

You'd get a negative slope of -1.8 with only these 3 observations. The intercept is 101.

All the -1.8 really means is that the higher the 3-start average, the lower the next start. But, as you can see, if you have a very low 3-start average, then you'd predict a high next start. The predicted point values for the first 3 pairs are basically -24, 66, 89. As the 3-start mean decreases, the prediction increases.
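
To check the arithmetic on those three pairs yourself, here is a small sketch using the plain least-squares formulas. The variable names are just for illustration. The slope, intercept and predictions should land close to the rounded figures quoted above.

```python
# Leiter's first three (3-start average, next start) pairs, as quoted above.
pairs = [(68.7, -18), (19.0, 32), (7.0, 117)]

n = len(pairs)
mean_x = sum(x for x, _ in pairs) / n
mean_y = sum(y for _, y in pairs) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in pairs)
         / sum((x - mean_x) ** 2 for x, _ in pairs))
intercept = mean_y - slope * mean_x

# Fitted next-start values at each observed 3-start average.
predictions = [intercept + slope * x for x, _ in pairs]

print(f"slope = {slope:.2f}, intercept = {intercept:.1f}")
print("predicted next starts:", [round(p, 1) for p in predictions])
```

The predictions rise as the 3-start average falls, which is the "vice versa" part of a negative slope.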

d) Ankiel's small slope doesn't exactly mean that his next start will look like his last three. Rather, it means that his last three don't add any predictive value. In Ankiel's case, you should just take his overall average SWP, regardless of what he's done lately (recall from stats that the intercept term in a least-squares regression is just the mean of the dependent variable if all of the slope coefficients are zero).

e) I'm not sure I follow your test in the second to last paragraph. Sounds interesting, however.

f) I've had my share of browser and computer difficulties, myself.
------------------------------
Hooey Pooey I'd agree with all that. I'm not up to AI programming at the moment, however :). Still working on my roster management and pitcher tracker Excel programs :).
8 Perm Dude
      ID: 28059111
      Mon, Mar 12, 09:15
Thanks, Madman. I understand that you would want a representative, valid sample, so 30 starts seems reasonable. I wonder, though, if, given the pressures of working in MLB, losing streaks of any magnitude would be rare, since pitchers that start to lose are taken out of the rotation (and released, or sent down).

I guess that's just the nature of this data universe, and we run with what we can.

Thanks for all the insights.

pd
9 Madman
      ID: 29246911
      Mon, Mar 12, 16:28
PD Limiting to starting pitchers with 30 starts eases the server-load issues for the Guru, too :)

Actually, the sample size is still too small, IMO. To do this right you should do a fixed-effects estimator across all the observations. Or at least across all the pitchers with 30+ starts.

The large quantity of negatives, however, I found to be interesting, nonetheless. Not all good results are statistically significant.

Maybe sludge could add something about this whole thing.

Your self-selection issue is reasonable. It was a bigger issue with Pitching Stats II. Here, each pitcher is estimated separately. Therefore, the presence of other pitchers excluded from the sample is irrelevant.

This is actually why I posted this thread. Basically, this is an extreme that shows the flip side of the Stats II thread argument. Since both come up with similar results, I find that persuasive. But more could be done.

Further, I'm really interested in Leiter and some of these guys. Will they continue to have this strong negative relationship? If so, this could be invaluable information regarding when to pick them up.
10 Myboyjack
      ID: 4443038
      Wed, Mar 14, 09:29
Madman or Sludge, This isn't on topic at all, but I didn't want to start a new thread for this question. Here is an interesting article by Eddie Epstein dealing with the greatest pitching staffs of all time. In it he measures them by compiling a list of teams which finished the most standard deviations better than the league average in runs allowed. Could you educate me as to what he means by standard deviation? Thanks.
11 Wammie
      ID: 20039259
      Wed, Mar 14, 09:54
Standard deviation is "the mean of the mean". You can look it up in any stat book, or on tons of web pages. But in a nutshell, if you graded a class on a curve, the shape of the grades would look like a bell, thus the bell curve. Then, on the bell curve, the middle area is the people who will get a C. One standard deviation above C is a B and two standard deviations above C is an A. It goes the other way too, with two standard deviations below C being an F. This is not to say there are only 6 deviations, just that most data will fall in those ranges. You could take an entire class on standard deviation though.
12 Wammie
      ID: 20039259
      Wed, Mar 14, 09:58
How to find it.

x = one value in your set of data
avg (x) = the mean (average) of all values x in your set of data
n = the number of values x in your set of data

For each value x, subtract the overall avg (x) from x, then multiply that result by itself (otherwise known as determining the square of that value). Sum up all those squared values. Then divide that result by (n-1). Got it? Then, there's one more step... find the square root of that last number. That's the standard deviation of your set of data.
Now, remember how I told you this was one way of computing this? Sometimes, you divide by (n) instead of (n-1). It's too complex to explain here. So don't try to go figuring out a standard deviation if you just learned about it on this page. I just hope that you've now got a grasp on the basic concept.
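
The recipe above can be sketched in a few lines. The function name `standard_deviation`, its `sample` flag and the numbers are made up for illustration; the flag just switches between dividing by (n-1) and dividing by (n).

```python
from math import sqrt

def standard_deviation(values, sample=True):
    """Standard deviation; divides by n - 1 when sample=True, by n otherwise."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared distances from the mean.
    squared = sum((x - mean) ** 2 for x in values)
    return sqrt(squared / (n - 1 if sample else n))

# Hypothetical SWP values for illustration only.
swp = [20, 40, 40, 40, 50, 50, 70, 90]
print(standard_deviation(swp), standard_deviation(swp, sample=False))
```

Python's built-in `statistics` module offers the same two flavors as `stdev` (n-1) and `pstdev` (n).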

13 walk
      ID: 212358
      Wed, Mar 14, 11:07
Madman. Your advice on what to do with your stats and analysis is like "spot on." I read them, I comprehend some of it, and then I digest it and then transfer it from my gut to my cerebral cortex and then hopefully call up this information in a gestalt kinda way when I have to make decisions, and then I make hasty ones using my gut anyway!

;-)

Thanks,
- walk
14 Sludge
      ID: 1440310
      Wed, Mar 14, 13:15
Myboyjack & Wammie -

The best interpretation for a standard deviation is that it is a measure of the average distance an observation will be away from the mean. So if you say that a pitcher has an average SWP of 80 and a standard deviation of 25, you could interpret it as meaning that on average he will be about 25 SWP away from 80.

Madman -

Did you examine this data as an autoregressive time series? If I recall correctly, if you do so, your estimates of the parameters will be the same, but the estimates of the standard error may not. In a typical regression, each observation is assumed to be independent of each other observation. In a time-series analysis, they are not assumed to be independent. Makes a big difference in estimating variances.
15 Madman
      ID: 29246911
      Wed, Mar 14, 18:19
sludge I am bad with time-series. But let me state my specification and logic, and then let you rip it apart.

y4 = a + b(y1+y2+y3) + e4

is the basic specification (e4 is iid). For all intents and purposes, this is an AR(3) process (only difference is the restriction on b to be the same across all previous observations). This fits under the umbrella of a classical regression model, with one caveat -- the stochastic nature of the y1, y2, y3.

The problem is that the sum of the three y's will be correlated across observations. I.e., if y3 is very low, then the sum of y's is likely to be low for the independent variables in the y4, y5 and y6 estimation equations.
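
To get a feel for how strong that overlap correlation is, here is a rough simulation sketch. It assumes the individual starts are completely independent draws (invented numbers, not real SWP), so any correlation between the 3-start averages comes purely from the shared starts. Averages one start apart share two of their three starts, averages two apart share one, and averages three apart share none.

```python
import random

def correlation(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5

# Simulated, independent "starts" (mean 50, sd 40): invented numbers, not real SWP.
rng = random.Random(0)
starts = [rng.gauss(50, 40) for _ in range(20000)]
averages = [sum(starts[i:i + 3]) / 3 for i in range(len(starts) - 2)]

def lag_corr(k):
    """Correlation between 3-start averages that are k starts apart."""
    return correlation(averages[:-k], averages[k:])

for k in (1, 2, 3):
    print(k, round(lag_corr(k), 2))
```

In theory the three correlations are 2/3, 1/3 and 0 under the independence assumption; the simulated values should come out near those.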

However, under a pair of assumptions, you can get the X's (i.e., y1's,y2's,y3's) to produce the asymptotic distribution of the LS estimator.

One of these assumptions (I think) is that (essentially) the beta's are less than one (which is generally the case empirically, so I don't feel too bad about this assumption). This is important for a reason I can't type out quickly in a message board post.

The second assumption relates to the matrix of cross-products of the independent variables with the lagged values of the same independent variables. Basically, that the correlation between x's depends only on how far apart in time they are, not on exactly when the observation occurred. This ensures that the process generating the x's converges to population quantities in the same manner that means from IID populations do. This is a stationarity assumption, essentially. I have no evidence to support the validity of this assumption, but it seems reasonable to me (on its face, at least).

So, that's basically all I know. I guess the short answer is "no, I didn't" adjust the variances. These are straight OLS estimates.

If I'm still missing something, please post. As I said, I'm no great shakes with time series.
16 Sludge
      ID: 18116195
      Wed, Mar 14, 22:27
Madman -

I thought you said you didn't know much about time series? You just summarized the conditions necessary for weak stationarity.

Anyway, could you provide me with the data? I'd be curious to take a quick look at it.
17 Madman
      ID: 29246911
      Wed, Mar 14, 23:10
sludge You can download the data in a comma-delimited text file from my gaming website,

Madmans Gaming Resources.

Kind of sad that I have a webpage devoted solely to games . . .

At any rate, I've had some time series, but I've almost never used it. I feel like I know some of the words, but I'm not sure I always follow what they really mean . . . Glad to know that at least I wasn't totally incoherent with my ramblings in the last post. :)

I think what might be best is some sort of fixed-effect model that would essentially pull the performance mean for each pitcher out of the mix, thereby allowing you to pool the pitchers together. I dunno. You can feel free to take a look, however.
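
One simple way to read "pull the performance mean for each pitcher out of the mix" is the so-called within estimator: subtract each pitcher's own averages before pooling. The sketch below is only that simplest version, with a made-up function name `within_slope` and two invented pitchers. It does not produce standard errors or deal with any of the time-series issues raised above.

```python
def within_slope(pairs_by_pitcher):
    """Pooled slope after subtracting each pitcher's own means (within estimator)."""
    num = 0.0
    den = 0.0
    for pairs in pairs_by_pitcher.values():
        n = len(pairs)
        mean_x = sum(x for x, _ in pairs) / n
        mean_y = sum(y for _, y in pairs) / n
        num += sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        den += sum((x - mean_x) ** 2 for x, _ in pairs)
    return num / den

# Hypothetical (3-start average, next start) pairs for two made-up pitchers.
data = {
    "ace":     [(80, 70), (60, 95), (90, 65), (70, 80)],
    "journey": [(30, 20), (10, 45), (40, 15), (20, 30)],
}

# Treating everyone as one group gives the naive pooled slope for comparison.
pooled = within_slope({"all": [p for pairs in data.values() for p in pairs]})
print(round(within_slope(data), 2), round(pooled, 2))
```

In this toy data each pitcher is counter-cyclical on his own, but naive pooling gives a positive slope because the better pitcher is high on both axes. Removing the per-pitcher means is what keeps that from swamping the pattern.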
18 Sludge
      ID: 1440310
      Thu, Mar 15, 11:27
Madman -

Did a little digging in a TS text. The differences are as follows:

Regression - Assume independent observations. If normal errors are assumed, then parameter estimates are exactly normally distributed.

Time Series - No independent observations. If normal errors are assumed, then parameter estimates are asymptotically normally distributed.

Parameter estimates and estimates of variances differ only slightly. However, because of the small sample sizes, I wouldn't trust any confidence limits computed using normal techniques.
19 Madman
      ID: 29246911
      Thu, Mar 15, 15:48
sludge Yep. My main point with all of this wasn't to prove anything with statistical precision. But I thought the vast preponderance of negative coefficients was rather striking.

This set of results, in which no assumptions were made to pool observations, is complementary to the pitching stats II thread (in which all starters were normalized and pooled), I think. If you'll notice, I was very careful (I think) to try to avoid any strict reliance on the variance estimates.

I realize that I need to estimate a more sophisticated model to really prove anything. But given that this isn't the most important of tasks . . . :)
20 biliruben
      Sustainer
      ID: 231045110
      Sat, Jun 16, 14:07
We now have enough data to begin thinking about these things in 2001.
21 steve houpt
      ID: 5155219
      Sat, Jun 16, 15:37
Madman - this may not matter for what you are doing. But whether a pitcher wins or loses has a drastic effect on individual game SWPs. Possible 45 point swing. Same line except for W/L could be 90 and 45 as far as SWPs. Whether he wins or loses is not always indicative of how he pitched [but winning is the name of the game]. Whether he wins or loses an individual game is more random IMHO than how he pitches. Long term, it may be easy to predict probable W/L record based on how a pitcher pitches and his support.

Last year during the season I ran some tracks [don't have anymore] of pitchers with minimum of XX starts and their SWPs less W/L points to try and give a quick reference for patterns. Off to the side I had +/- how many W/L SWPs they averaged per start. When looking at any pitcher you are going to take into account probability of win based on HIS normal run support and opponent.