RotoGuru Baseball Forum

0 Subject: Attn: Web Programmers

Posted by: Guru
- [330592710] Wed, Mar 31, 2004, 09:55

In the Hoops Assimilator, I built in the capability to look up a TSN roster and load it directly into the Assimilator. I did this by having the Assimilator run a php script which reads in the latest frozen roster, parses the player data, and spits it out as JavaScript code in an Assimilator-friendly format.

I've started to program the same routine for the baseball Assimilator, but I have run into a snag. Apparently, TSN will not allow baseball rosters to be accessed unless the browser is logged in. In Hoops, anyone (whether logged in or not) could read a frozen roster page. But in baseball, unless you are logged in (presumably evidenced by the appropriate cookie), you cannot access the comparable page.

Since the php script is being run on the server side, there is no ability (that I know of) to log in and remain in a "logged in" state. However, my experience with this sort of issue is very limited.

Some of you develop similar applications which mine data from TSN roster pages, so I thought I'd see whether you might have an elegant solution to this dilemma.

The only workarounds that I've thought of would be to have the Assimilator access the TSN roster page from the client side. From there, I'd have to either pass the source code to a parsing routine (how could this be done?), or else identify the data directly within the Assimilator JavaScript code - probably via surveying the various embedded link objects in search of a recognized syntax.

Do any of you programming geeks have any thoughts as to the best way to accomplish this mission? Is there a way to effectively have the server recognized as being logged in? Or if not, does the source code of a page exist as a browser/JavaScript object or environment variable that could be passed to the server?

Thanks for your ideas.
1Keith
ID: 452482011
Wed, Mar 31, 2004, 10:26
Guru,

You could probably do this if you started playing with the HTTP stream directly. I don't know how much you know about the protocol, but if you either created custom packets or had a module that let you manipulate packets (i.e. appending cookie values), you should be able to send an HTTP request with the proper authentication attached to it, and receive an HTTP response that you could strip for the data you are looking for.

Keith
2Guru
ID: 330592710
Wed, Mar 31, 2004, 10:44
I don't know how to do that.
3penngray
Sustainer
ID: 423241723
Wed, Mar 31, 2004, 10:48
Guru, I'm working on my web-based parsing for all rosters. It has changed from last year, but I think I should have a solution by the weekend. Of course, I'm doing a lot more server-side scripting using Python, so I think it's a little easier for me.
4KrazyKoalaBears
Leader
ID: 517553018
Wed, Mar 31, 2004, 11:06
Guru, back in the day of TeamTracker (3-4 years ago?), I ran into a similar problem and never found a solution for PHP. I would certainly be interested in any solution someone may find.
5Guru
ID: 330592710
Wed, Mar 31, 2004, 11:27
It looks like my JavaScript-based solution won't work. I can read the roster page into a separate frame, but because it is from a different domain, I am unable to use JavaScript to access any data from it in other frames.
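The restriction described in [5] is the browser's same-origin rule: script in one frame may read another frame only when both pages share scheme, host, and port. A simplified Java sketch of that comparison follows. Real browsers apply more detailed rules, and the class name and the www.example.com address are hypothetical, since the Assimilator's own address is not given in this thread.

```java
import java.net.URI;

final class SameOriginCheck {

    /** Port actually used by a URI: the explicit one, else the scheme default. */
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    /** True when both URIs share scheme, host, and port (simplified rule). */
    static boolean sameOrigin(URI a, URI b) {
        if (a.getScheme() == null || a.getHost() == null
                || b.getScheme() == null || b.getHost() == null) {
            return false; // relative or malformed URIs never match
        }
        return a.getScheme().equalsIgnoreCase(b.getScheme())
                && a.getHost().equalsIgnoreCase(b.getHost())
                && effectivePort(a) == effectivePort(b);
    }

    public static void main(String[] args) {
        URI assimilator = URI.create("http://www.example.com/assimilator.html"); // hypothetical
        URI roster = URI.create("http://fantasygames.sportingnews.com/baseball/");
        System.out.println(sameOrigin(assimilator, roster));
    }
}
```

Two pages on different hosts fail the check, which is why the frame can display the roster but the Assimilator's script cannot read it.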
6Guru
ID: 330592710
Wed, Mar 31, 2004, 11:31
penngray - I have no experience with Python, but my server apparently supports it, so perhaps your solution (if you figure one out) would be useful.
7Keith
ID: 452482011
Wed, Mar 31, 2004, 11:55
Hi Guru,

I'm not a PHP expert, but I know HTTP pretty well. I believe the function you need to use is fsockopen, to open a remote connection (in this case, at sportingnews). Here's some documentation on the function: http://us4.php.net/fsockopen. You want to manipulate the "GET" request you send to TSN, and it is imperative that the cookie values go in a "Cookie:" line among the request headers, before the blank line that ends them. Once you get it working, it will be easy to parse the HTML you receive. Here's another link that should help: http://www.webreference.com/programming/php/cookbook/chap11/1/. I can probably put some code together if you need, but I really am not a PHP expert.

Keith
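The approach in [7] (write the GET request by hand and include the cookies in its headers) can be sketched in Java, the language of the other code in this thread, rather than PHP. This is only an illustration: the class and method names are made up, and the thread does not say which cookie names TSN expects, so any cookie passed in is a placeholder.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;

final class RawCookieGet {

    /** Builds the text of a GET request that carries a Cookie header line. */
    static String buildRequest(String host, String path, Map<String, String> cookies) {
        StringBuilder request = new StringBuilder();
        request.append("GET ").append(path).append(" HTTP/1.1\r\n");
        request.append("Host: ").append(host).append("\r\n");
        if (!cookies.isEmpty()) {
            StringJoiner pairs = new StringJoiner("; ");
            for (Map.Entry<String, String> cookie : cookies.entrySet()) {
                pairs.add(cookie.getKey() + "=" + cookie.getValue());
            }
            request.append("Cookie: ").append(pairs.toString()).append("\r\n");
        }
        request.append("Connection: close\r\n"); // ask the server to close when done
        request.append("\r\n");                   // blank line ends the header block
        return request.toString();
    }

    /** Sends the request over a plain socket and returns the raw response text. */
    static String send(String host, int port, String request) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(request.getBytes(StandardCharsets.ISO_8859_1));
            out.flush();
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                response.write(buffer, 0, n);
            }
            return new String(response.toByteArray(), StandardCharsets.ISO_8859_1);
        }
    }
}
```

The string returned by send still contains the status line and response headers ahead of the HTML, and the body may arrive in chunked form, so a ready-made HTTP library is usually less work than this raw approach.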
8penngray
Sustainer
ID: 423241723
Wed, Mar 31, 2004, 12:30
Guru, the great thing about Python is that it is easy to use, free, and it comes with all Linux platforms. PHP is mostly tied to web development; Python can be used server-side completely separately if it needs to be, because it's a general-purpose programming language. When and if you ever have time, take a look at it for future use.


9youngroman
ID: 59242611
Wed, Mar 31, 2004, 12:48
i'm a java-junkie, so i can only offer a working solution in java:

public class TsnBasicBaseball {
    public static void main(String[] args) {
        try {
            HTTPClient.HTTPConnection con = new HTTPClient.HTTPConnection("fantasygames.sportingnews.com");
            con.setAllowUserInteraction(false);
            // log in first, on the same connection object that will fetch the roster
            HTTPClient.HTTPResponse resp = con.Get("/crs/home_check_reg.html?username=xxx&password=yyy");
            // then request the frozen roster page
            resp = con.Get("/baseball/season1/basic/game/frozen_roster.html?user_id=zzz&table_id=2");
            byte[] data = resp.getData(); // raw bytes of the roster page
        } catch(Exception exc) {
            exc.printStackTrace();
        }
    }
}

where xxx is username, yyy is password and zzz is teamid.
the HTTPClient-classes can be found here: HTTPClient
10youngroman
ID: 59242611
Wed, Mar 31, 2004, 12:53
just found this: HTTP_Request package for PHP
11Guru
ID: 330592710
Thu, Apr 01, 2004, 11:52
Just a brief update:

I'm currently fiddling with XML as a possible way to import and manipulate the roster data on the client side. I'm still not sure it will work, but early tests are encouraging enough that I'm going to continue to plug ahead.

The various server side solutions mentioned above might turn out to be better alternatives, but they also may be too technical for my experience level. However, if someone can develop a server-side script that accepts a team ID# and returns the related frozen roster page, I might be able to simply plug it in. My server is unix based, and supposedly supports the following: perl 5.6.0, C/C++, Java JDK 1.4, Python 1.5.2 and Python 2.1.1 & TCL 8.3 Interpreters
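For the "accepts a team ID#" half of such a script, a small Java sketch might look like this. The host and path are the ones from post [9]; the class and method names are hypothetical, and the digits-only check is an assumption about what TSN team IDs look like.

```java
final class RosterRequest {

    static final String HOST = "fantasygames.sportingnews.com";

    /** Returns the frozen-roster path for a team ID, rejecting anything but digits. */
    static String rosterPath(String teamId) {
        if (teamId == null || !teamId.matches("\\d+")) {
            // Refuse input that could smuggle extra query parameters into the URL.
            throw new IllegalArgumentException("team ID must contain digits only");
        }
        return "/baseball/season1/basic/game/frozen_roster.html?user_id="
                + teamId + "&table_id=2";
    }

    public static void main(String[] args) {
        System.out.println("http://" + HOST + rosterPath("161"));
    }
}
```

Validating the ID before building the URL matters once the script is public, because the value comes straight from whoever calls it.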
12youngroman
ID: 221118186
Fri, Apr 02, 2004, 06:53
here is a working example with jsp. all you need is the HTTPClient linked above.
place the zip-file in your classpath (usually the WEB-INF/lib directory) and save the following code as tsn.jsp (change [html]/[body]/[br] to real html-tags)

this jsp-code can be used anywhere in your html-code


[html]

[body]
<%
    // calling url is http://127.0.0.1:8080/tsn.jsp?user=youngroman&pass=xxxxx&id=161

    // necessary parameters
    String user = request.getParameter( "user" );
    String pass = request.getParameter( "pass" );
    String id = request.getParameter( "id" );

    // get a connection to tsn
    at.wave.util.httpclient.HTTPConnection con = new at.wave.util.httpclient.HTTPConnection("fantasygames.sportingnews.com");

    // set proxy (only if necessary; remove this line when connecting directly)
    con.setProxyServer("proxy", 8080);

    // allow auto-cookies
    con.setAllowUserInteraction(false);

    // login
    at.wave.util.httpclient.HTTPResponse resp = con.Get("/crs/home_check_reg.html?username=" + user + "&password=" + pass);

    // get roster
    resp = con.Get("/baseball/season1/basic/game/frozen_roster.html?user_id=" + id + "&table_id=2");

    // get response
    byte[] data = resp.getData();
    String site = new String(data); // convert response to string (for following indexOf)

    String[] players = new String[8]; // number of players in BasicBaseball
    String[] ids = new String[8]; // tsn-ids of players in BasicBaseball
    int playerCount = 0;
    int pos = site.indexOf("player_id"); // first occurrence of "player_id", searching from position 0
    // stop at the array size so extra "player_id" links cannot overflow the arrays
    while(pos != -1 && playerCount < players.length) {
        int posStart = site.indexOf('>',pos); // start of player-name
        int posEnd = site.indexOf('<',posStart); // end of player-name
        if(posStart == -1 || posEnd == -1) break; // truncated page: no complete link left
        ids[playerCount] = site.substring(pos + 10, posStart - 1); // player-id
        players[playerCount++] = site.substring(posStart + 1, posEnd); // player-name
        pos = site.indexOf("player_id",posEnd); // search next
    }
%>
<%
// print ids and player-names
for(int i = 0; i < playerCount; i++) {
%>
<%= ids[i] %>
<%= players[i] %>
[BR]
<% } %>
[/body]
[/html]
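The scanning loop in the jsp above can also be pulled out into a plain Java method, which makes it testable without a server. The sketch below swaps the indexOf loop for a regular expression and is not a copy of the original logic. The link markup it expects (player_id=digits inside a tag, followed by the name as the link text) is inferred from the offsets in the jsp, not confirmed against a real TSN page, and the class name is made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RosterParser {

    // Assumed markup: ...player_id=1234">Player Name</a>
    private static final Pattern PLAYER_LINK =
            Pattern.compile("player_id=(\\d+)[^>]*>([^<]*)<");

    /** Returns player IDs mapped to player names, in page order. */
    static Map<String, String> parse(String html) {
        Map<String, String> players = new LinkedHashMap<>();
        Matcher m = PLAYER_LINK.matcher(html);
        while (m.find()) {
            players.put(m.group(1), m.group(2).trim());
        }
        return players;
    }

    public static void main(String[] args) {
        String sample = "<a href=\"player.html?player_id=1234\">Player One</a>";
        System.out.println(parse(sample));
    }
}
```

Because the result is a map rather than fixed arrays of 8, the method does not depend on the roster size of any one game.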
13Guru
ID: 330592710
Fri, Apr 02, 2004, 09:00
youngroman - I have no idea what your post [12] says.

1. Where is the working example? Does the java code in post [9] relate to this?

2. On what computer do I put the HTTPClient files(s) linked above? My home PC? Or the server which will deliver the page? I can find no WEB-INF/lib directory on either machine. (I assume it must be the server, or else this solution could hardly be useful to a wider community of users.)

3. What is the purpose of the simple HTML code you list?


Sorry to be dense about this, but I guess your instructions are too cryptic for me.



14RecycledSpinalFluid
ID: 4134319
Fri, Apr 02, 2004, 10:37
Guru, view the source for the page, for the real story. A case of the "disappearing code".
15youngroman
ID: 59242611
Fri, Apr 02, 2004, 11:06
1. the code in 12 is the working example (at least on my local pc's).

2. all files are needed on the server

HTTPClient.zip (or better rename it to HTTPClient.jar) - copy this file to your java-classpath-directory, in tomcat this would be the web-inf/lib-dir of your web-application, i don't know how the directory is named on your webserver
tsn.jsp - copy this file to a directory you can access with your browser

the jsp-file can be found here: tsn.jsp

3. the html shows only the wrap-around for the jsp. i only want to demonstrate how it could be done. the real html-file would be the left frame of the assimilator
16penngray
Sustainer
ID: 423241723
Fri, Apr 02, 2004, 17:06
guru, just starting on my stuff.

I'm writing a non-web-based solution because it's going to run as a scheduled service every morning.

But in general I believe even in PHP you should be able to do the following

send an HTTP GET using this link

http://fantasygames.sportingnews.com/crs/home_check_reg.html?

passing in the arguments username=guru&password=yourpassword

Once that is done you should be able to get to any roster.

Sorry, I don't know the general PHP commands, but I think the above logic should be available.
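Building that login link in code needs one extra step: the username and password have to be URL-encoded, or a password containing "&" or "=" will break the query string. A minimal Java sketch, using the URL given in [16]; the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class LoginUrl {

    private static final String LOGIN_PAGE =
            "http://fantasygames.sportingnews.com/crs/home_check_reg.html";

    /** Returns the login URL with both arguments safely URL-encoded. */
    static String build(String username, String password) {
        try {
            return LOGIN_PAGE
                    + "?username=" + URLEncoder.encode(username, "UTF-8")
                    + "&password=" + URLEncoder.encode(password, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java runtime is required to support UTF-8.
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("guru", "yourpassword"));
    }
}
```

Fetching this URL is only half the job; the cookies in the response still have to be kept and sent back with the roster request.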
17Guru
ID: 330592710
Fri, Apr 02, 2004, 17:24
Nope, that works from the client side where the browser can preserve the logged in state (via cookies, I presume). But that does not work from the server side - or at least I haven't figured out how. That was the first thing I tried. But when I access that login URL, I don't even get back to a logged in "My account" page.
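The cookie presumption fits how browsers behave: the browser stores the Set-Cookie values from the login response and sends them back on every later request, and a server-side script has to do that bookkeeping itself. Below is a minimal Java sketch using the JDK's java.net.CookieManager as the cookie jar. The class name LoginSession is made up, and nothing in this thread says which cookies TSN really sets, so treat it as an illustration only.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.URI;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LoginSession {

    private final CookieManager cookies = new CookieManager();

    /** Stores any Set-Cookie headers found in a response from the given page. */
    void remember(URI page, Map<String, List<String>> responseHeaders) {
        try {
            cookies.put(page, responseHeaders);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the Cookie header value to send with a request for the given page. */
    String cookieHeaderFor(URI page) {
        try {
            Map<String, List<String>> headers =
                    cookies.get(page, new HashMap<String, List<String>>());
            List<String> values = headers.get("Cookie");
            return values == null ? "" : String.join("; ", values);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The intended flow is: request the login URL, pass the response headers to remember, then add cookieHeaderFor(rosterUri) as the Cookie header of the roster request. With java.net.HttpURLConnection, calling CookieHandler.setDefault(new CookieManager()) once before the login request should have the same effect without the manual steps.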

I'm unable to make any headway with youngroman's java solution. I can't even tell if my server really does support java, as I'm getting absolutely nowhere, and my lack of knowledge severely limits my ability to debug the process.

I can get at the roster page from the client side using xml, and I think I can even process the data successfully from there. It does require a relaxation of default browser security, enabling the access of an external domain. This can be done in MSIE through a security setting adjustment, but Netscape (or Mozilla) appears to be inflexible on that front. If this turns out to be my only workable solution, then I'll pursue it, but I'm still hoping that I can figure out a better approach that would work in all standard browsers.

The PHP HTTP request package linked in [10] still might offer a solution, but I got bogged down in the technical minutiae and have not gone back to that path yet.

I have a feeling I'll get there someday, but it may be a slow, tedious process. Next week, I'll probably work on PSC and Swirve versions first, as they will not have the same security obstacles.
18penngray
Sustainer
ID: 423241723
Fri, Apr 02, 2004, 17:45
hmmm...sorry. I'm not doing a client browser-based solution, but I will have to in the near future for something else I'm doing.
19Guru
ID: 330592710
Fri, Apr 02, 2004, 18:19
I'd be happy with a server-side browser-based solution. In fact, that would probably be preferred.
20youngroman
Da Man
ID: 59242611
Sat, Apr 03, 2004, 05:52
here is the php solution (from 10):
php-solution

all needed php-files are inside the zip-archive. i tried it only from the command-line, but it should work from your server too.
24Guru
ID: 330592710
Sat, Apr 03, 2004, 09:57
Yee-haw! It works! You da man-youngroman!

I was starting to work down that same path, but probably would have gotten hung up when trying to manipulate the cookies.
25Perm Dude
Dude
ID: 30792616
Sat, Apr 03, 2004, 10:39
Added the appropriate moniker to #20.

pd
26youngroman
ID: 59242611
Sat, Apr 03, 2004, 10:41
i have to say that looks not bad, i always wanted to be "da man"