Geek Matt: August 2010

Scrape Powerball Number Frequency

>> Monday, August 30, 2010

I know that buying lottery tickets is a total waste of money. I also know that there is nothing you can do to increase the odds of hitting a Powerball jackpot (1 in 195,249,054). However, just for fun, I thought I'd fire off a quick Groovy script that reads the number frequencies from the Powerball website.

Once I have all the numbers, I could sort them by frequency to find out which are drawn most often (or least often). I'm sure there is other statistical analysis that you could do too if you wanted. You could also watch the frequency trends over time (though this historic list of numbers may be better for that).

Here's the simple script:

import org.cyberneko.html.parsers.SAXParser;

//initialize
def maxPowerBall = 39
def whiteBallFreq = [:]
def pbFreq = [:]

//read the numbers
def url = 'http://powerball.com/powerball/pb_frequency.asp'
def html = new XmlSlurper(new SAXParser()).parse(url)

//fill the frequency maps from the table rows (skipping the first two rows and the last row)
html.BODY.TABLE.TBODY.TR[3].TD[1].TABLE.TBODY.TR[4].TD[1].TABLE.TBODY.TR[2..-2].each{nextRow ->
 def cols = nextRow.children()
 def ball = Integer.parseInt(cols[0].text())
 
 whiteBallFreq[ball] = parsePossiblyBlankNumber(cols[1].text())
 if(ball <= maxPowerBall){
  pbFreq[ball] = parsePossiblyBlankNumber(cols[2].text())
 }
}

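//the website leaves a cell blank instead of showing 0, so treat blank text as 0
int parsePossiblyBlankNumber(String value) {
 def trimmed = value.trim()
 if(trimmed.length() == 0){
  return 0
 } else {
  return Integer.parseInt(trimmed)
 }
}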

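The import comes from the NekoHTML library, whose SAXParser lets XmlSlurper read real-world HTML that is not well-formed XML. The long "html.BODY.TABLE..." line is needed because of how the Powerball website is set up: the frequency table sits inside several nested layout tables.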

I put in the parsePossiblyBlankNumber method since the website uses blanks instead of 0s when there are no frequencies.
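
As a rough sketch of the sorting idea mentioned earlier, assuming the frequencies end up in a map of ball number to count (like whiteBallFreq above), something along these lines would rank the balls. The rankByFrequency name and the sample counts are made up for illustration; they are not real Powerball data.

```groovy
// Hypothetical helper: returns ball numbers ordered from most to least frequent.
// Ties are broken by ball number so the result is deterministic.
List rankByFrequency(Map freq) {
    def sorted = freq.sort { a, b -> (b.value <=> a.value) ?: (a.key <=> b.key) }
    return sorted.keySet().toList()
}

// Made-up sample counts standing in for the scraped whiteBallFreq map
def sampleFreq = [1: 12, 2: 7, 3: 19, 4: 7, 5: 15]

def ranked = rankByFrequency(sampleFreq)
println "Most frequent:  ${ranked.take(3)}"
println "Least frequent: ${ranked.reverse().take(3)}"
```

The same helper could be pointed at pbFreq to rank the red Powerball numbers.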

Leave a comment if you have any questions or suggestions!

Code Formatting in Blogger Using SyntaxHighlighter

>> Friday, August 20, 2010

I wanted to use SyntaxHighlighter, but it turns out it takes a little tweaking on Blogger. I found a number of blogs that had instructions, but none of them seemed to work. Here is what I did to get the code formatting working.

1. Upload files to Google Sites (Optional)
You need to link to a few JavaScript and CSS files. It looks like SyntaxHighlighter has a hosted version, but I decided to host the specific files I wanted myself. Here are the files you need:

shCore.js
shCore.css
shCoreDefault.css (or your theme, I used shCoreEclipse.css)
Brushes (one per type of code). I used shBrushPlain.js, shBrushJava.js, and shBrushGroovy.js

2. Edit your Blogger template
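Go to Design > Edit HTML. Towards the bottom of the template, add the link and script tags that load the files from step 1, followed by a short script that sets SyntaxHighlighter.config.bloggerMode = true and then calls SyntaxHighlighter.all().

The "trick" for Blogger is to set "bloggerMode" to true.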

3. Wrap your code in "pre"
For example, if I wanted to blog about this code in Java, I would use class="brush: java" like this:



<pre class="brush: java">
public static void main(String[] args){
 System.out.println("Hello World!");
}
</pre>

And it would look like this:


public static void main(String[] args){
 System.out.println("Hello World!");
}

Fantasy Football Cheat Sheet Scraper

>> Wednesday, August 18, 2010

With my fantasy football draft coming up, I thought I would create a quick Groovy script to scrape a couple of fantasy football cheat sheets. If you don't know, a cheat sheet is a ranking of players that you can use to make your draft picks. There are a bunch available online, and they are updated frequently (for example, now that Brett Favre is back with the Vikings, I assume he will move up and Tarvaris Jackson will move down in rank).

Since I wanted the most up-to-date ones on the day of my draft, I wrote a couple of classes in Groovy to read the webpages and print out the results in a common format. This turned out to be pretty easy to do.

For example, here is one from http://www.fftoolbox.com:


import org.cyberneko.html.parsers.SAXParser;

class FFToolbox {
    public List getPlayerList(){
        def playerList = []
        
        //read and parse each page of the rankings (HTML, cleaned up by NekoHTML)
        ['http://www.fftoolbox.com/football/2010/overall.cfm?page=1',
                'http://www.fftoolbox.com/football/2010/overall.cfm?page=2',
                'http://www.fftoolbox.com/football/2010/overall.cfm?page=3',
                'http://www.fftoolbox.com/football/2010/overall.cfm?page=4'].each{url ->
                    def html = new XmlSlurper(new SAXParser()).parse(url)
                    def table = html.BODY.'**'.findAll{ it.name() == 'TABLE'}[0]
                    def rows = table.TBODY.children()[1..-2]
                    
                    //convert each row to my business object (BO), storing the cell text
                    rows.each{ nextRow ->
                        def columns = nextRow.children()
                        playerList.add(new CheatSheetEntry(
                                rank: columns[0].text(),
                                name: columns[1].text(),
                                position: columns[2].text(),
                                team: columns[3].text(),
                                byeWeek: columns[4].text())
                                )
                    }
                }
        return playerList
    }
}

Note that the import comes from the NekoHTML library, which provides an HTML parser that XmlSlurper can use.

Here is another example from ESPN.com:


import org.cyberneko.html.parsers.SAXParser

class ESPN {
    public List getPlayerList(){
        def playerList = []
        
        def url = 'http://sports.espn.go.com/fantasy/football/ffl/story?page=NFLDK2K10rankstop200'
        def html = new XmlSlurper(new SAXParser()).parse(url)
        
        def table = html.BODY.'**'.findAll{ it.name() == 'TABLE'}[0]
        def rows = table.TBODY.children()
        
        rows.each{nextRow ->
            def columns = nextRow.children()
            //the second column holds "Name, Team", so split it at the first comma
            def nameAndTeam = columns[1].text()
            def comma = nameAndTeam.indexOf(",")
            //take the first run of non-digit characters from the position column
            def positionMatcher = columns[3].text() =~ /\D+/
            playerList.add(new CheatSheetEntry(
                    rank: columns[0].text(),
                    name: nameAndTeam.substring(0, comma),
                    position: positionMatcher[0],
                    team: nameAndTeam.substring(comma + 2),
                    byeWeek: columns[2].text())
                    )
        }
        
        return playerList
    }
}

Basically they do the same things: connect to and parse the URL(s), find the table with the cheat sheet, and then go through the rows and columns, assigning the correct values to a GroovyBean called CheatSheetEntry.
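
The CheatSheetEntry class itself is not shown in this post. A minimal sketch of what such a GroovyBean could look like, using the property names from the two scrapers above; the untyped fields and the toString() format are assumptions, not the original class:

```groovy
// Hypothetical sketch of the bean: property names come from the scrapers,
// but the untyped fields and the output format are assumptions.
class CheatSheetEntry {
    def rank
    def name
    def position
    def team
    def byeWeek

    // One line per player so every site prints in the same format
    String toString() {
        "${rank}. ${name} (${position}, ${team}) bye: ${byeWeek}"
    }
}

// Made-up sample values, not real rankings
def entry = new CheatSheetEntry(rank: '1', name: 'Sample Player', position: 'RB', team: 'MIN', byeWeek: '4')
println entry
```

Groovy generates the getters, setters, and the named-argument constructor, which is why both scrapers can build entries with rank:, name:, and so on.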

Now I just need to combine the lists somehow to get an average ranking of the players.
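
One possible way to do that, sketched under the assumption that a player's name is spelled the same on every site: group all entries by name and average the ranks. The averageRankings helper and the sample data are hypothetical, and plain maps stand in for CheatSheetEntry objects.

```groovy
// Hypothetical helper: averages each player's rank across several cheat sheets.
// Each sheet is a list of entries exposing `name` and `rank` (CheatSheetEntry objects
// or, as in the sample below, plain maps). Returns maps sorted by average rank.
List averageRankings(List sheets) {
    def byName = sheets.flatten().groupBy { it.name.toString().trim() }
    def averages = byName.collect { name, entries ->
        // rank may be scraped text, so go through toString() before converting
        def total = entries.sum { it.rank.toString().trim().toDouble() }
        [name: name, averageRank: total / entries.size()]
    }
    return averages.sort { it.averageRank }
}

// Made-up sample sheets, not real rankings
def sheetA = [[name: 'Player A', rank: 1], [name: 'Player B', rank: 2], [name: 'Player C', rank: 3]]
def sheetB = [[name: 'Player B', rank: '1'], [name: 'Player A', rank: '3'], [name: 'Player C', rank: '2']]

averageRankings([sheetA, sheetB]).each { println "${it.name}: ${it.averageRank}" }
```

A player who appears on only one sheet is averaged over that sheet alone, and differently spelled names would be treated as different players, so real data would probably need some name cleanup first.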
