Showing posts with label youtube. Show all posts

Extract Audio from YouTube Videos



>> Tuesday, September 20, 2011

I wanted to get an mp3 of a video on YouTube. There are sites that will do it for you, but I wanted a little more control.


Turns out it was pretty easy with free software.

1) Download the flv video
I used the open-source YouTube Downloader from SourceForge. With that Java application, I just had to drag the URL from my browser onto the application and it would start downloading. I could even download multiple videos at once.

2) Extract the audio
ffmpeg was all I needed in this case. The command I used was:
ffmpeg -i youtubevideo.flv output.mp3

Simple as that!


Word Frequency for YouTube Videos



>> Wednesday, November 17, 2010

YouTube has a feature where you can browse the top viewed videos over a specific time-frame (today, this week, this month, or all time). I thought it would be interesting to see which words (if any) pop up more than others. Just glancing at the list, I guessed that "justin" and "bieber" would come out on top. I thought I'd write some quick Groovy code to see if I was right.

The Stats:

Here is what I found when I looked at the top 160 most viewed videos of all time (as of today):

Top 25 Words:

Word      Count  Freq
official  19     3%
music     12     1%
song      7      1%
cyrus     7      1%
miley     7      1%
version   6      0%
gaga      6      0%
lady      6      0%
bieber    6      0%
justin    6      0%
jason     5      0%
feat      5      0%
dance     5      0%
love      5      0%
baby      5      0%
high      4      0%
david     4      0%
best      4      0%
this      4      0%
sean      3      0%
nuki      3      0%
iglesias  3      0%
enrique   3      0%
goes      3      0%
like      3      0%

Other Stats:
Total words: 619
Total unique words: 424
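
The Freq column appears to be each word's count as a share of the 619 total words, truncated to a whole percent. That is an inference from the numbers, not something stated in the post: 12 of 619 is about 1.9%, which the table shows as 1%. A minimal Groovy sketch of that arithmetic, where `freqPercent` is a hypothetical helper name:

```groovy
// Hypothetical helper (not from the original source): a word's share of all
// counted words as a whole percent, truncated rather than rounded.
def freqPercent = { int count, int total -> (count * 100).intdiv(total) }

int totalWords = 619  // "Total words" from the stats above

println "official: ${freqPercent(19, totalWords)}%"
println "music: ${freqPercent(12, totalWords)}%"
println "version: ${freqPercent(6, totalWords)}%"
```

Truncation would explain why every word with a count of 6 or less shows up as 0%.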

The Code:

All the source code is located here (box.net)

Here are the guts of the program:
def html = new XmlSlurper(new SAXParser()).parse(urlString)
html.'**'.findAll{ it.@class == 'video-title'}.each {nextVideo ->
    //split the title using regex on non-word characters
    nextVideo.text().split(/\W/).each{nextWord ->
        def lowerCase = nextWord.toLowerCase()
        //limit the results to "interesting" words
        if(lowerCase.length() >= minWordLength && !(lowerCase in ignoreList)){
            wordFreq[lowerCase] = wordFreq[lowerCase] == null ? 1 : wordFreq[lowerCase] + 1
        }
    }
}


Here is what it does:
1. Parse the page at the YouTube URL using XmlSlurper
2. Find all the titles on the page
3. Split each title to get individual words
4. Convert each word to lower case to make comparisons easier
5. Limit the words by a minimum length (to get rid of stuff like "of", "and", "the") and ignore other words (like "video")
6. Update the wordFreq map
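
The counting steps can be tried without hitting YouTube at all. The sketch below runs the same split, lower-case, filter, and count logic over a few made-up titles, then ranks the result. The titles, the `minWordLength` of 4, and the one-entry `ignoreList` are all assumptions for illustration; the post does not give the real values, and the ranking step is not part of the snippet above.

```groovy
// Made-up titles standing in for the scraped 'video-title' elements.
def titles = [
    'Example Song (Official Music Video)',
    'Another Song - Official Version',
    'The Best Dance Song of All Time'
]

// Assumed settings; the real values are not given in the post.
def minWordLength = 4
def ignoreList = ['video']

def wordFreq = [:]
titles.each { title ->
    // split on non-word characters; empty strings fall out via the length check
    title.split(/\W/).each { word ->
        def lowerCase = word.toLowerCase()
        if (lowerCase.length() >= minWordLength && !(lowerCase in ignoreList)) {
            wordFreq[lowerCase] = (wordFreq[lowerCase] ?: 0) + 1
        }
    }
}

// Highest counts first, ties broken alphabetically.
def ranked = wordFreq.sort { a, b -> (b.value <=> a.value) ?: (a.key <=> b.key) }
ranked.each { word, count -> println "${word}\t${count}" }
```

With these sample titles, "song" and "official" end up at the top, while "video", "the", "of", and "all" are filtered out.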

I am not very comfortable with filtering by minimum length and ignoring words, but without that the top 10 words were: video, the, official, ft, music, i, t, you, in, on. That is much less interesting than the filtered list, in my opinion.
