RotoGuru Baseball Forum

0 Subject: Live stats tool - new release

Posted by: Guru
- [330592710] Fri, Apr 16, 2010, 13:51

Today I'm releasing a new tool which will list players' stats during the day as games progress. Stats will be updated every three minutes.

The program should be considered a "beta" version for now, as I expect there will be a need for a number of tweaks and/or fixes. I'm also looking for feedback on formatting.

I've tried to make the output PDA-friendly, so that those who access this via cell phone or PDA device can easily read the data. To that end, the format may appear to be somewhat cryptic. If desirable, I could consider offering a separate output version for computers that have wider monitors - but let's first see if that seems desirable or necessary.

Here is the link:
http://rotoguru.net/cgi-bin/getstats.pl

As of this moment, all that shows are the players in the opening lineup of the afternoon game (all in red font, which denotes that the game has not yet started.) Once the game starts, those stats will update (and the font will turn blue). When the game is completed, the font will turn black.

All players are listed alphabetically. Pitchers are listed with a yellow background (a bolder yellow for starters, and a light yellow for relievers.)

I plan to use this as a basic engine for some new tools that I'll be developing over the coming months. For now, I just want to make sure that the data is accurate, and is presented in the best format.

Please provide feedback here. If you notice any mistakes, please point them out, so that I can investigate. If you have ideas to improve the output or make it easier to read or digest, I'm interested in hearing that too.
1 Guru
      ID: 330592710
      Fri, Apr 16, 2010, 17:10
Perhaps a comment on the format would be helpful.

Here is a sample line:

Manzella, Tommy HOU@chc[F,2-7] 1/4 RBI SO-2

This is interpreted as follows:
name: Tommy Manzella
game: Hou at Cubs (player's team is in CAPS)
game is Final, Houston 2, Cubs 7

Manzella went 1 for 4 (1 hit in 4 AB)
He had one RBI and struck out twice

In the 9th inning, just before the game ended, his line would have appeared as
Manzella, Tommy HOU@chc[9,2-7] 1/4 RBI SO-2
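
For anyone who wants to read these lines programmatically, here is a minimal Python sketch of a parser for the sample format described above. It is not the tool's own code: the regular expression, the `parse_line` name, and the layout of the returned dictionary are all assumptions made for illustration, and it only covers the stat-then-count notation (`SO-2`) shown in this post.

```python
import re

# Assumed pattern for lines such as
#   "Manzella, Tommy HOU@chc[F,2-7] 1/4 RBI SO-2"
LINE_RE = re.compile(
    r"^(?P<last>[^,]+), (?P<first>.+?) "
    r"(?P<away>[A-Za-z]+)@(?P<home>[A-Za-z]+)\s*"
    r"\[(?P<status>[^,\]]+)(?:,(?P<away_runs>\d+)-(?P<home_runs>\d+))?\]"
    r"\s*(?P<stats>.*)$"
)


def parse_line(line):
    """Parse one player line into a dictionary (hypothetical layout)."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized line: {line!r}")

    away, home = m["away"], m["home"]
    # The player's team is the one written in capitals.
    team = away if away.isupper() else home

    stats = {}
    for token in m["stats"].split():
        if "/" in token:                # "1/4" means hits / at-bats
            hits, at_bats = token.split("/")
            stats["H"], stats["AB"] = int(hits), int(at_bats)
        elif token.endswith("IP"):      # "6.0IP" is kept as text
            stats["IP"] = token[:-2]
        elif "-" in token:              # "SO-2" means two strikeouts
            name, count = token.rsplit("-", 1)
            stats[name] = int(count)
        else:                           # a bare "RBI" means a count of one
            stats[token] = 1

    score = None
    if m["away_runs"] is not None:
        # Score is listed visitor first, then home team.
        score = (int(m["away_runs"]), int(m["home_runs"]))

    return {
        "name": f"{m['first']} {m['last']}",
        "team": team,
        "final": m["status"] == "F",
        "score": score,
        "stats": stats,
    }
```

Calling `parse_line("Manzella, Tommy HOU@chc[F,2-7] 1/4 RBI SO-2")` yields the name "Tommy Manzella", team "HOU", a final score of (2, 7), and the stats H=1, AB=4, RBI=1, SO=2.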
2 youngroman
      Donor
      ID: 02934823
      Fri, Apr 16, 2010, 17:18
some observations:
1) I'd prefer the format [number][stat] instead of [stat]-[number]

example:
Paulino, Felipe HOU@chc[8,2-7] 6.0IP H-6 R-5 ER-5 BB-3 K-3
vs.
Paulino, Felipe HOU@chc[8th, 2-7] 6.0IP 6H 5R 5ER 3BB 3K

2) to identify the relief pitchers you need very good eyes.

3) what if you take a font where the characters are equal length and make it look like columns. something like:

Andrus, Elvis   TEX@nyy[pre]
Arias, Joaquin  TEX@nyy[pre]
Baker, Jeff     hou@CHC[F,2-7] 0/1
Bartlett, Jason TB@bos [pre]
Beckett, Josh   tb@BOS [pre]
Beltre, Adrian  tb@BOS [pre]
Borbon, Julio   TEX@nyy[pre]
Bourn, Michael  HOU@chc[F,2-7] 0/4
Brignac, Reid   TB@bos [pre]
Burrell, Pat    TB@bos [pre]
Byrd, Marlon    hou@CHC[F,2-7] 1/4 R 2B
Byrdak, Tim     HOU@chc[F,2-7] 0.2IP H R ER
...

4) and finally one wish: is it possible to add a flag if the player is still in the game?
3 Guru
      ID: 330592710
      Fri, Apr 16, 2010, 17:28
1) The problem with putting the number in front of the stat is for doubles and triples.

2 doubles doesn't work well as 22B

I considered using the number as a prefix except for doubles and triples, but didn't like the inconsistency.

Another thought was to use subscripts.

2 doubles could be 2B₂

If I do that, should 2 Runs still be 2R, or R₂?



2) I'll try a slightly more colorful background for relievers.

3) Good suggestion. Let me fiddle with that.

4) Perhaps, but it's a tricky coding challenge to identify those players - although probably not impossible. Assuming I can identify those players, how would you suggest notating it?
4 KrazyKoalaBears
      ID: 12353217
      Fri, Apr 16, 2010, 20:07
I'd echo youngroman about the format. But, I'd put the count ahead of the stat, so that it reads...

Paulino, Felipe HOU@chc[8,2-7] 6.0-IP 6-H 5-R 5-ER 3-BB 3-K

...which actually READS better as...

"6 innings pitched, with 6 hits allowed, 5 runs given up, 5 earned runs given up, 3 walks allowed, and 3 strikeouts."

It actually displays as the brain would read it. Your doubles would be 2-2B and is easy to read.

For #4, maybe do the reverse and indicate players who are OUT of the game. You could prepend them with an X and if you use a fixed-width font with the [pre] tag, you could leave that space empty for players still in the game and get...


    Andrus, Elvis   TEX@nyy[pre]
    Arias, Joaquin  TEX@nyy[pre]
X - Baker, Jeff     hou@CHC[F,2-7] 0/1
    Bartlett, Jason TB@bos [pre]
    Beckett, Josh   tb@BOS [pre]
    Beltre, Adrian  tb@BOS [pre]
    Borbon, Julio   TEX@nyy[pre]
    Bourn, Michael  HOU@chc[F,2-7] 0/4
    Brignac, Reid   TB@bos [pre]
X - Burrell, Pat    TB@bos [pre]
    Byrd, Marlon    hou@CHC[F,2-7] 1/4 1-R 1-2B
    Byrdak, Tim     HOU@chc[F,2-7] 0.2IP 1-H 1-R 1-ER


...or, if you wanted to go with signaling players in the game, do something simple like a +, as in "still able to add more stats"...


+ Andrus, Elvis   TEX@nyy[pre]
+ Arias, Joaquin  TEX@nyy[pre]
  Baker, Jeff     hou@CHC[F,2-7] 0/1
+ Bartlett, Jason TB@bos [pre]
+ Beckett, Josh   tb@BOS [pre]
+ Beltre, Adrian  tb@BOS [pre]
+ Borbon, Julio   TEX@nyy[pre]
+ Bourn, Michael  HOU@chc[F,2-7] 0/4
+ Brignac, Reid   TB@bos [pre]
  Burrell, Pat    TB@bos [pre]
+ Byrd, Marlon    hou@CHC[F,2-7] 1/4 1-R 1-2B
+ Byrdak, Tim     HOU@chc[F,2-7] 0.2IP 1-H 1-R 1-ER


I think I may like the second version better.
5 Guru
      ID: 330592710
      Fri, Apr 16, 2010, 20:38
I just released an updated version, with some of the changes suggested above. Also corrected a few errors.

Still trying to decide the best format for stats>1.

6 Alex
      ID: 3217239
      Sat, Apr 17, 2010, 07:16
Need a little bit of extra space between columns for cases where a team scores 10+ runs.
7 Guru
      ID: 330592710
      Sat, Apr 17, 2010, 09:42
Updated version out - corrects column alignment and puts stat count in front of stat, per KKB's suggestion.

If a game has both teams in double digits, the stat alignment will still be slightly off for that row.
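
One way to keep the stat column lined up even when both teams reach double digits is to size each column from the widest entry in the current batch rather than from a fixed width. The Python sketch below is only an illustration of that idea, not the tool's actual code; the `align_rows` name and the three-field row layout are assumptions, and the 12-10 score in the usage note is made up.

```python
def align_rows(rows):
    """Pad name and game fields so that the stats start in the same column.

    rows is a list of (name, game, stats) string tuples (assumed layout).
    """
    name_width = max(len(name) for name, _, _ in rows)
    game_width = max(len(game) for _, game, _ in rows)
    return [
        # rstrip() drops the padding on rows that have no stats yet.
        f"{name:<{name_width}} {game:<{game_width}} {stats}".rstrip()
        for name, game, stats in rows
    ]
```

With rows for `hou@CHC[F,2-7]`, a hypothetical `hou@CHC[F,12-10]`, and `TEX@nyy[pre]`, every stat line begins in the same column because the game field is padded to the width of the longest bracketed score.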

I'm not sure what will happen today with the completion of the suspended game from last night. I suspect I'll pick up the entire stats for both the suspended game and the regularly scheduled game. That will undoubtedly require some modification.
8 Guru
      ID: 330592710
      Sat, Apr 17, 2010, 09:52
For days like today, with the completion of a suspended game followed by a regular game, how would the output be most useful?

Two lines for each player, one per game?
Stats consolidated from both games?
Or ignore the first game?

For the completion of the suspended game, I don't think I can show only the stats generated from the point of game resumption. It's all or nothing, in all likelihood.

Suppose it was a true doubleheader, instead of a suspended game completion. Would the answer be different?
9 KrazyKoalaBears
      ID: 12353217
      Sat, Apr 17, 2010, 10:29
First off, it's looking really good. I'm finding that I can just glance at a player's line and quickly see/understand the stats.

A couple of "early" morning thoughts...

1. The matchup is a touch odd to me. I'm used to other sources having the home team in all caps. Thus, having the player's team in all caps doesn't translate as easily for me. I'm wondering if you might do something like...

Victorino, Shane PHI vFLA[F,6-8] 1/4 R BB
Votto, Joey CIN @PIT[F,3-4] BB

The one-space separation better shows the team of the player and then the team they're playing. I capitalized the opponent to allow for the "v". You could always just do...

Victorino, Shane PHI fla[F,6-8] 1/4 R BB
Votto, Joey CIN @pit[F,3-4] BB

But, again, at a quick glance it doesn't translate as easily to me. The use of "v" (or "vs") and "@" seem pretty standard.

2. Personal thing. I'm a fan of the three-letter team codes. Thus, TB > TAM, SF > SFO, SD > SDG, KC > KAN. But I understand I may be in the minority on that one. :)

3. Red font won't work well for color-blind people. Also, red is usually associated with "problem" or "error." I was thinking a gray font might better signify "these players are coming up, but aren't quite playing yet"...
Pregame lineups | Game in progress | Game complete

...which becomes...


Baker, Scott kc@MIN [F,3-10] 7.0IP 7-H 2-R 2-ER 6-K Win
Balfour, Grant TB@bos [pre]
Barajas, Rod NYM@stl[F,3-4] 1/4
Barden, Brian FLA@phi[8,6-8] 0/1
Barmes, Clint COL@atl[F,5-9] 0/2 R 2-BB
Bartlett, Jason TB@bos [pre]
Barton, Daric bal@OAK[F,2-4] 0/4 SO



4. For the suspended/postponed games and double-headers, I'd recommend a designator just before the stats. Maybe (1) for Game 1, (2) for Game 2, (S) for suspended game, (C) for continuation of a suspended game (I thought R for "resume", but it looks too much like P), and (P) for postponed. I considered it before the game info, but that would unnecessarily add space to other lines. You'll need to separate these games because of pitchers. If a relief pitcher comes in and gets a W in one game, but a L in the next or something else like that, it'll look weird to have "Win Loss" in the same line. Better to split it for clarity.

So, for last night, it would look like...

Baker, Scott kc@MIN [F,3-10] 7.0IP 7-H 2-R 2-ER 6-K Win
Balfour, Grant TB@bos [9,1-1] (S) 2.0IP H BB 2-K
Barajas, Rod NYM@stl[F,3-4] 1/4
Barden, Brian FLA@phi[F,6-8] 0/1
Barmes, Clint COL@atl[F,5-9] 0/2 R 2-BB
Bartlett, Jason TB@bos [9,1-1] (S) 0/4
Barton, Daric bal@OAK[F,2-4] 0/4 SO

... and if it were a double header day...

Baker, Scott kc@MIN [F,3-10] 7.0IP 7-H 2-R 2-ER 6-K Win
Balfour, Grant TB@bos [8,1-1] (2) 2.0IP H BB 2-K
Barajas, Rod NYM@stl[F,3-4] 1/4
Barden, Brian FLA@phi[F,6-8] 0/1
Barmes, Clint COL@atl[F,5-9] 0/2 R 2-BB
Bartlett, Jason TB@bos [F,3-1] (1) 0/4
Bartlett, Jason TB@bos [8,1-1] (2) 2/4 1-2B
Barton, Daric bal@OAK[F,2-4] 0/4 SO
10 KrazyKoalaBears
      ID: 12353217
      Sat, Apr 17, 2010, 10:29
Ugh! Forgot to hit "yes" for ignore line feeds. My [pre] didn't show up properly. Sorry about that.
11 JeffG
      ID: 47112621
      Sat, Apr 17, 2010, 11:38
Real useful tool. Definitely something I'll bookmark.

My two cents, just throwing brainstorm stream of consciousness ideas out, good and otherwise, for this version and 2.0 and beyond.


- My preference would be to see the quantity precede the stat label for all items.

- I'd even say no dash precedes the stat except for the 2B/3B stats label as in these examples:
Braden, Dallas bal@OAK[F,2-4] 7.0IP 3H 2R 2ER BB 4K Win
Werth, Jayson fla@PHI[F,6-8] 2/4 2R RBI 2-2B

- Maybe consider calculating the day's AVG OBP SLG ERA WHIP on the stat line for example:
Braden, Dallas bal@OAK[F,2-4] 7.0IP 3H 2R 2ER BB 4K Win 2.57ERA 0.57WHIP
Werth, Jayson fla@PHI[F,6-8] 2/4 2R RBI 2-2B .500AVG .500OBP 1.000SLG

- Maybe BOLD the team they play for instead of CAPS and always have it read home @ visitors.

- The position(s) that the player played that game would be useful. Maybe with the Yahoo standard caret ^ next to it if they started. If so, perhaps after the name and before the teams.

- Since it is pseudo-real time, put the time accumulated next to the date until all games completed.

- Offer a position and team filter on the top

- Since you have inning number, how about top or bottom.

- Maybe for games not started (pre game lineups) you indicate the scheduled first pitch time where you'd have inning number.

- total pipe dream... I do not know if you'd be able to determine it from your stat sources, (You'd have one over EVERYONE if you could), but I always wanted to be able to see in my team stats if my fantasy pitcher was the pitcher of record if the game was still in progress.

- I'd maybe think about separating hitters and pitchers instead of dealing with highlighted rows. You could then put the pitcher's hitting stats among the hitting area.

- Since this is going to be the greatest one-stop stat source of all time, maybe have all the games current linescores listed after the players list.
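
The single-game ratios suggested in the list above (AVG, OBP, SLG, ERA, WHIP) follow the standard formulas, and a short Python sketch shows how little input they need. The function names and argument lists are hypothetical, and the code assumes that box-score innings such as `0.2` mean two outs (two thirds of an inning). It does not guard against zero at-bats or zero innings.

```python
from fractions import Fraction


def innings(ip):
    """Convert box-score innings text ("7.0", "0.2") to exact innings."""
    whole, _, outs = ip.partition(".")
    return int(whole) + Fraction(int(outs or 0), 3)


def pitching_ratios(ip, hits, walks, earned_runs):
    """Return (ERA, WHIP) for one outing, rounded to two decimals."""
    inn = innings(ip)
    era = 9 * earned_runs / inn
    whip = (hits + walks) / inn
    return round(float(era), 2), round(float(whip), 2)


def batting_ratios(ab, hits, doubles=0, triples=0, homers=0,
                   walks=0, hbp=0, sf=0):
    """Return (AVG, OBP, SLG) for one game."""
    # Every hit is worth one base; extra-base hits add the remainder.
    total_bases = hits + doubles + 2 * triples + 3 * homers
    avg = hits / ab
    obp = (hits + walks + hbp) / (ab + walks + hbp + sf)
    slg = total_bases / ab
    return avg, obp, slg
```

For a line of 7.0 IP, 3 H, 2 ER, 1 BB, `pitching_ratios("7.0", 3, 1, 2)` gives a 2.57 ERA and a 0.57 WHIP. For a hitter who goes 2 for 4 with two doubles and no walks, `batting_ratios(4, 2, doubles=2)` gives .500 AVG, .500 OBP, and a 1.000 SLG.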
12 Guru
      ID: 330592710
      Sat, Apr 17, 2010, 22:38
Got the suspended/regular game issue resolved this afternoon. Wasn't able to get to much else today. Life intervenes.

Many of the suggestions above are good, and I'll work to implement them.

These are the things I plan to add in the next update (which should be on Sunday or Monday):
1. Change the red font to gray.

2. Adopt KKB's game nomenclature
TAM @BOS
STL vNYM

3. Change the stat count output so that the hyphen appears only for 2Bs and 3Bs

4. Add top or bottom of inning indication

5. Add scheduled start time (if it's easily accessible - still not sure how tricky that may be)

6. Add an "as of" time stamp to the date at the top

Things to ponder (more feedback invited):
1. Is it better to separate pitchers and hitters? Is there a reason to show pitcher hitting stats?

2. Is a position listing useful?

3. Is it useful to show ratios for the game, like AVG, ERA, etc? They are easy to calculate, but I don't find those calculations very useful for single games.

4. I really doubt if I can figure out who the pitcher of record is (for games in progress). Is that always even possible, if the raw input is simply a boxscore?


Features to be added somewhat later (assuming they are more difficult to program)
1. Indicator if a player is no longer in the game.
2. Indicator if a player started.
3. Add a game summary at the bottom (or somewhere).


Longer term features to include:
1. Ability to filter listed players
2. Ability to calculate fantasy points for a range of formulas (maybe even user specified)
3. Ability to have the report emailed to you at specific times.

These latter items will require some extra infrastructure, which will delay implementation for a while. Might even make some of these features available only as a premium service. Will have to think about that - but if I can make this as useful and as flexible as I think I can, I think there may be an opportunity.
13 Guru
      ID: 330592710
      Sat, Apr 17, 2010, 23:03
Hmmm...

I just realized that if I switch the nomenclature for the game as proposed, then the score becomes more confusing.

When I always show the visiting team first, then I can show the score that way too.

It's much simpler to interpret the score if it syncs with the order of the teams listed to the left.

I'm going to have to mull this one.
14 astade
      Sustainer
      ID: 214361313
      Sat, Apr 17, 2010, 23:37
Random thoughts:

I'm not sure about using '@' for all matchups. It's not as intuitive using the CAPS to denote the team that the player is on. Putting their team first seems to be more intuitive.

Also, how about putting a separator (maybe '------') between players whose last names start with A and those starting with B? Just to help break up the sheet.

How about an export to excel feature that automatically does a text to columns so that it is easy to digest w/o having to manually do it?

Last one, what about having a hyperlink for each player so we can view their season stats? maybe a radio button for Yahoo!, ESPN, etc (depending on our preference?)

Otherwise, looks good. Thank you.

15 Guru
      ID: 330592710
      Sat, Apr 17, 2010, 23:55
If the player's team is listed first, how should the score be listed? Always showing the visitor's score first? Or showing the player's team first?
16 astade
      Sustainer
      ID: 214361313
      Sun, Apr 18, 2010, 00:02
I'm curious what others prefer. But here is an example:

Aardsma, David det@SEA[F,2-4] 1.0IP K Save

could be written:

Aardsma, David SEA vs. det [L,2-4] 1.0IP K Save

17 KrazyKoalaBears
      ID: 12353217
      Sun, Apr 18, 2010, 12:13
Personally, I would match the score to the order of teams. Thus...

Aardsma, David SEA vDET [4-2] 1.0IP K Save

...because in my mind I can easily read that as, "David Aardsma pitched for Seattle at home versus Detroit and the team won 4-2 and..."

Using the same line and changing it for discussion...

Aardsma, David SEA @DET [2-5] 1.0IP K

...becomes, "David Aardsma pitched for Seattle at Detroit and the team lost 2-5 and..."

For me, having the player's team's score first is much more intuitive than having every matchup be "@".
18 JeffG
      ID: 47112621
      Sun, Apr 18, 2010, 12:40
Just ideas. Maybe keep the score with the team, winning team (or team leading) listed first, home in caps, underline player's team

Aardsma, David...... SEA 4 det 2 [F] 1.0IP K Save

Wright, David...... nym 2 STL 1 [B20] 1/6 3-BB 3-SO SB CS


Or use @ to specify home team, caps for player's team

Aardsma, David...... @SEA 4 det 2 [F] 1.0IP K Save

Wright, David...... NYM 2 @stl 1 [B20] 1/6 3-BB 3-SO SB CS
19 Guru
      ID: 330592710
      Sun, Apr 18, 2010, 15:13
Update:

Pregame lineup font changed from red to gray.
Order changed to put all hitters first (alpha), then all starters, then all relievers.

I probably don't need the yellow background anymore, but I still have that formatting for the time being.
20 Guru
      ID: 330592710
      Sun, Apr 18, 2010, 15:58
Update:

An "as of" update time is now shown on the top line.

Stat counts only include hyphen for doubles and triples.
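
The rule can be sketched in a few lines of Python. This is an illustration only, not the tool's own code, and the `format_stat` name is hypothetical: the count goes in front of the label, a count of one is left implicit, and a hyphen is inserted only when the label itself starts with a digit.

```python
def format_stat(name, count):
    """Format one stat with its count in front of the label."""
    if count == 1:
        return name                     # a single RBI is just "RBI"
    # Labels that begin with a digit ("2B", "3B") would run into the
    # count, so they keep the hyphen: 2 doubles -> "2-2B", not "22B".
    separator = "-" if name[0].isdigit() else ""
    return f"{count}{separator}{name}"
```

For example, `" ".join(format_stat(n, c) for n, c in [("R", 2), ("RBI", 1), ("2B", 2)])` produces `2R RBI 2-2B`.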
21 Guru
      ID: 330592710
      Sun, Apr 18, 2010, 17:18
Update:

Team ID is now 3 characters long for all teams

Player's team is listed first, in CAPS

Opposing team is lower case, with '@' or 'v' separator.

Game score is in the same sequence as the team sequence.
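
To make the change concrete, here is a Python sketch that converts a game field from the earlier visitor@home layout into the layout described in this update. It is an assumption-laden illustration rather than the tool's code: the `convert_game` name is hypothetical, and the exact spacing of the new field is a guess based on the examples earlier in the thread.

```python
import re

# Old layout: visitor@home, player's team in CAPS, score visitor-home.
OLD_GAME_RE = re.compile(
    r"^([A-Za-z]+)@([A-Za-z]+)\s*\[([^,\]]+)(?:,(\d+)-(\d+))?\]$"
)


def convert_game(old):
    """Rewrite e.g. "det@SEA[F,2-4]" as "SEA vdet[F,4-2]"."""
    m = OLD_GAME_RE.match(old.strip())
    if m is None:
        raise ValueError(f"unrecognized game field: {old!r}")
    away, home, status, away_runs, home_runs = m.groups()

    if home.isupper():
        # Player is on the home team: "v" separator, home score first.
        team, separator, opponent = home, "v", away
        runs = (home_runs, away_runs)
    else:
        # Player is on the visiting team: "@" separator, order unchanged.
        team, separator, opponent = away, "@", home
        runs = (away_runs, home_runs)

    score = "" if runs[0] is None else f",{runs[0]}-{runs[1]}"
    return f"{team.upper()} {separator}{opponent.lower()}[{status}{score}]"
```

A home player's `det@SEA[F,2-4]` becomes `SEA vdet[F,4-2]`, while a visiting player's `HOU@chc[F,2-7]` becomes `HOU @chc[F,2-7]`, so the first number is always the score of the player's own team.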
22 KrazyKoalaBears
      ID: 12353217
      Sun, Apr 18, 2010, 17:23
Looking GREAT, Guru! Much, much, much easier to "read" with all the changes you've implemented.
23 Guru
      ID: 330592710
      Sun, Apr 18, 2010, 18:13
Update:

Top and Bottom inning indicators added.
Yellow backgrounds for pitchers eliminated.
24 JeffG
      Dude
      ID: 01584348
      Mon, Apr 19, 2010, 11:23
Yellow backgrounds still showing in the color legend at the top.

Pedroia and Crawford both have sac bunts today 4/19. Not really crucial roto/fantasy info so I am not sure if those were bypassed by design, or missed. Just figured I'd bring to your attention.
25 Guru
      ID: 330592710
      Mon, Apr 19, 2010, 11:51
Sac bunts: They were bypassed by design, but they would be easy to include.

Other stats I have not bothered to pick up:
LOB
SF
GIDP
Errors

There is no technical reason to exclude them, other than that most people probably don't care about them. Maybe LOB and GIDP are of no interest, but S, SF, and E should be included?
26 Guru
      ID: 330592710
      Mon, Apr 19, 2010, 12:31
I can also include PO (picked off), although that abbreviation might be confused with putouts. I'll add it in for now.
27 Guru
      ID: 330592710
      Mon, Apr 19, 2010, 13:18
Latest Update:
Added stats: S, SF, E, PO (runners and pitchers)
Added pitcher hitting stats. These are shown separately at the bottom.


The next iteration of updates will probably include:
1. Display starting time for games not yet started
2. Add indicators showing who started, and who is still in the game.

Those won't be incorporated until at least tomorrow, however. I need to attend to other issues for the rest of today - and ensure that the current version works properly.
28 Guru
      ID: 330592710
      Tue, Apr 20, 2010, 20:14
Latest updates:

1. Added start time for pregame lineups
2. Listed (at bottom) games with no lineup posted yet
3. Show + if player is still in the game
4. Show ^ if player started

I'm not sure the +^ symbols are the best way to do this, but it is short and simple.


30 Guru
      ID: 330592710
      Wed, Apr 21, 2010, 10:46
Comma-delimited file now available as output, which can easily be copied into a spreadsheet.

To get this version of output, simply append ?out=csv to the URL:
http://rotoguru.net/cgi-bin/getstats.pl?out=csv

Top line shows date
Second line shows column legend
Game-by-game player listing follows.

If there is a doubleheader, those games will be separately listed as game 1 and 2. If there is a completion of a suspended game, that will be listed separately as game 0.

For any prior game date, you can add date=mdd to the URL. For example, for April 17 stats in csv format:
http://rotoguru.net/cgi-bin/getstats.pl?date=417&out=csv
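
For scripted use, the URL can be built and the downloaded text split into its three parts with a few lines of Python. This is a sketch under stated assumptions: the helper names are hypothetical, `date=mdd` is read as the month without a leading zero followed by a two-digit day (consistent with 417 for April 17), and the column names in the usage note are invented, since the real legend comes from the file itself.

```python
import csv
import io
from datetime import date
from urllib.parse import urlencode

BASE_URL = "http://rotoguru.net/cgi-bin/getstats.pl"


def csv_url(day=None):
    """Build the csv URL for today (day=None) or for a prior date."""
    params = {}
    if day is not None:
        # Assumed reading of "mdd": April 17 -> "417".
        params["date"] = f"{day.month}{day.day:02d}"
    params["out"] = "csv"
    return f"{BASE_URL}?{urlencode(params)}"


def split_csv(text):
    """Split csv text into (date line, column legend, player rows)."""
    rows = list(csv.reader(io.StringIO(text)))
    date_line, legend, players = rows[0], rows[1], rows[2:]
    # Pair each value with its column name from the legend line.
    return date_line, legend, [dict(zip(legend, row)) for row in players]
```

`csv_url(date(2010, 4, 17))` returns the April 17 URL shown above. Given text whose second line is a legend such as `name,team,AB,H` (made-up column names), `split_csv` returns each later row as a dictionary keyed by those names.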

I suspect this is now the most statistically comprehensive and easy to access source of daily stats on the web.
31 Guru
      ID: 330592710
      Wed, Apr 21, 2010, 11:14
A few more things I hope to add soon:

1. Include intentional walks. This would probably only appear in the csv version, but not the normal output.

2. Include position data. This will probably be a text string that looks like it does in a typical boxscore. For someone who starts at a position and plays there all game, it will just be the position. For someone who switches during the game, it will be something like SS-3B, or PH-RF. I don't know if I'll include this in the standard output or not, but I can certainly add it to the csv format.
32 Guru
      ID: 330592710
      Wed, Apr 21, 2010, 13:23
Intentional walks have now been captured (for both hitters and pitchers), but do not appear in the standard output by default. If you want to see that data, then add IBB=1 to the URL.

Boxscore position data is now available. By default, it will also not appear in the standard output, as I didn't think most users would find it useful. But if you do want to see it, then append pos=1 to the URL. For example, here is the current day output with intentional walks and positions included:
http://rotoguru.net/cgi-bin/getstats.pl?IBB=1&pos=1

The position(s) are listed at the far right, {in brackets}. Starting position is listed first, as in a normal boxscore. Position is not listed for pitchers unless they appeared in some other role.

All data items will always be included in the csv version automatically.


As always, let me know if you notice any errors.
33 astade
      Sustainer
      ID: 214361313
      Thu, Apr 22, 2010, 01:44
looks good. thanks, guru.
34 Guru
      ID: 330592710
      Thu, Apr 22, 2010, 11:32
I just added the ability to sort hitters by position.

For this sort, starters are sorted by starting position. Subs are sorted by the first position listed in the boxscore. If that "position" is PH or PR, then that player is sorted in with the DH's. Also, by default, all outfielders are sorted together (i.e., not sorted by LF, CF, RF). Separate sorting within outfield is available as an option.
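
The sorting rules just described can be expressed as a sort key. The Python sketch below is only an illustration, not the tool's own code: the `position_key` name, the `split_outfield` flag, and above all the order of positions in `POSITION_ORDER` are assumptions, since the actual display order is not spelled out here.

```python
# Assumed display order; the tool's real ordering may differ.
POSITION_ORDER = ["C", "1B", "2B", "3B", "SS", "OF", "LF", "CF", "RF", "DH"]


def position_key(boxscore_pos, split_outfield=False):
    """Return a sort rank for a boxscore position string such as "PH-RF"."""
    # Only the first position listed counts for sorting.
    first = boxscore_pos.split("-")[0].upper()
    if first in ("PH", "PR"):
        first = "DH"                    # pinch hitters/runners go with the DHs
    if not split_outfield and first in ("LF", "CF", "RF"):
        first = "OF"                    # default: all outfielders together
    if first not in POSITION_ORDER:
        return len(POSITION_ORDER)      # unknown positions sort last
    return POSITION_ORDER.index(first)
```

A list of `(name, position)` pairs can then be ordered with `sorted(players, key=lambda p: (position_key(p[1]), p[0]))`, and passing `split_outfield=True` corresponds to the option that sorts outfielders by field.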

To sort by position, append sort=p to the URL.
To sort by position with outfielders sorted by field, append sort=po
To see the boxscore positions listed, you still need to include pos=1. (You can sort by position without displaying the full position for each player, however.)

For example, to see positions listed and to sort by position with outfielders grouped together, the URL is:
http://rotoguru.net/cgi-bin/getstats.pl?pos=1&sort=p
35 Guru
      ID: 330592710
      Thu, Apr 22, 2010, 16:54
I added a few more options that will probably not be of interest to most. However, I also added a section at the bottom that lays out all of the various display options and parameters.