Stolen from some friends of mine on Facebook:
Sunday, October 31, 2010
Saturday, October 30, 2010
Friday, October 29, 2010
Danny Boy
A truly moving take on the classic Irish tune.
P.S. Turns out the Latin sentence I translated for a blog post yesterday was the one I had to translate on the test today! Bam.
Thursday, October 28, 2010
Latin Insanity
While dutifully studying for my Latin test tomorrow, I realized I needed to make a blog post. Lacking any better ideas, I decided to make it about some of the absurd things those Ancient Romans did that I'm struggling with 2000 years later.
Wednesday, October 27, 2010
I Am Batman
(I failed to update my blog for yesterday! While I sit here contemplating my wrongs, have a funny picture)
Monday, October 25, 2010
My Professional Acting Debut
Pay close attention at 0:35, you can just kinda see me in the background.
Sunday, October 24, 2010
Stephen Fry on Linguistic Pedants
Stephen Fry (aka Jeeves of Jeeves and Wooster fame) is one of those people whose voice is so pleasant to listen to that I could probably happily listen to him talk about anything; Morgan Freeman is another. Luckily, this excerpt from one of his podcasts (full version here) is actually quite fascinating, and I agree with him wholeheartedly. On top of that, someone has made an excellent video to go along with it, using a technique called Kinetic Typography if you want to be fancy, or "moving text" if you don't.
(original)
Gordon Goodwin and Take 6 - Comes Love
Here's a cool video from the recording session for Comes Love from the album XXL by Gordon Goodwin's Big Phat Band. This song features the absolutely fantastic jazz vocal group Take 6.
Saturday, October 23, 2010
Zelda Reorchestrated And Other Assorted Videogame Music
I've been going back through some of my music that I've had for a long time but haven't listened to much, and discovered a great set of music produced by a project called Zelda Reorchestrated (or ZREO). They've taken songs from the classic Legend of Zelda games and spruced them up, rearranging them using some very high quality orchestral samples. Read on for some samples of my favorites.
Friday, October 22, 2010
lingua latina utilissima est
I've been taking Latin on and off for a long time now (7 years? Wow), but I'm only recently starting to appreciate how much it's taught me about English grammar. I'm taking an intro to linguistics class this semester and we've left phonology and morphology behind in the last week and started talking about English grammar and syntax, a horribly complex beast. However, I've noticed that a lot of things that are stumbling blocks for most of the class seem pretty simple to me because of what I've learned from Latin.
For example, take a sentence like "this dog is bored". In order to determine the syntactical structure of this (simple) sentence, we need to know what parts of speech all the words are. "This" is what's called a determiner (not really important), "dog" is a noun, "is" is a verb, and "bored" is..... what?
Thursday, October 21, 2010
My Sleep Habits: Fun With Facebook, Part 3.5
(Previous post here)
I'm not making this a full post yet, because I don't feel like writing much but I still want to share this cool result. Basically, I got a working HTML parser for Ruby and am beginning to create a framework for analyzing the Facebook data stored in HTML. My first interesting result is this graph for frequency of wall posts (including status updates) by hour, in military time:
Interesting conclusions to draw here:
- Facebook seems to indicate that I'm asleep before 2:00 AM the vast majority of the time.
- The drop at hour 15 (3:00-4:00 PM) is probably related to going home at the end of the high school day, which ended at 3:50.
- The other drop at hour 19 (7:00-8:00 PM) is probably because that's when most people eat dinner.
Coding this took almost no time, so I'm super pleased with the cool results I got out of just this simple test. I'll be able to refine this particular dataset later once I can take comments into account too.
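For the curious, here's a rough sketch of how a count like this could be done in Ruby. It is not the actual script: the timestamp format, the sample data, and the posts_by_hour helper are all made up for illustration, and it assumes the HTML parsing has already produced a plain list of timestamp strings.

```ruby
require "time"

# Hypothetical sample data: timestamps as they might come out of the
# exported HTML (the real format in the Facebook archive may differ).
timestamps = [
  "2010-10-20 01:15",
  "2010-10-20 15:05",
  "2010-10-21 22:40",
  "2010-10-21 22:59",
]

# Count posts per hour of the day (0-23).
def posts_by_hour(timestamps)
  counts = Hash.new(0)
  timestamps.each do |stamp|
    counts[Time.parse(stamp).hour] += 1
  end
  counts
end

counts = posts_by_hour(timestamps)

# Print a crude text histogram, one row per hour in military time.
(0..23).each do |hour|
  puts format("%02d | %s", hour, "#" * counts[hour])
end
```

Using Hash.new(0) means hours with no posts just count as zero, so the histogram loop doesn't need any special cases.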
Wednesday, October 20, 2010
Lions, iLifes, and Airs, Oh My!
Apple had a big PR event today entitled "Back To The Mac" that just wrapped up a few hours ago. The big-ticket items were:
- iLife '11
- Mac OS X 10.7, codenamed Lion
- New MacBook Airs
If only they hadn't released Tiger five years ago, I would have had the perfect headline for this post. Alas.
Anyway, I was eagerly following Engadget's liveblog as it went down, and here are my basic first impressions.
Tuesday, October 19, 2010
Sintel
Sintel is a great short indie film that you can watch on YouTube here. Amazingly, it was produced entirely with Blender, a free and open source 3D application. It's only 15 minutes long, but it packs quite an emotional punch. Check it out!
(Thanks to Andy at Everyday Nitrocellulose for the source.)
Monday, October 18, 2010
Fun With Facebook, Part 3
(Previous post here, and source code for this post here)
As I mentioned before, my method for generating searches to find names was really bad, because I was searching the whole space of strings and wasting a lot of time on searches that found no results, like "zx" or "qq". Last time, it took a little over two hours to generate only the useful searches that were 5 letters long. With my new, non-terrible code, it takes under 10 seconds to generate all 8348 search strings of any length. That's a pretty massive improvement. How massive?
Well, there are a total of 8348 search strings of any length that will produce at least one result. The actual alphabet they drew from is bigger than just "a" to "z", because some people's names have special symbols, like dashes for compound last names, but we'll just assume that they drew from "a" to "z" in order to get a lower bound. The longest search string that gets results is 13 letters long, so the space of all possible search strings I would have had to look through using the old method is the sum of all the spaces for each length, i.e. all the 1 letter strings plus all the 2 letter strings plus all the 3 letter strings, and so on up to 13. Computing that sum yields a whopping 2,580,398,988,131,886,038 possible searches. The 8348 searches that return results represent a rather small portion of those, approximately 3.2 * 10^-13%. In other words, if I'd run my bad program enough to find all the searches, only 0.00000000000032% of my time would have been productive.
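If you want to check that arithmetic yourself, here's a quick sketch in Ruby. The variable names are just for illustration, and it makes the same simplifying assumption as above (a 26-letter alphabet and a maximum length of 13).

```ruby
# Size of the search space: all strings of length 1..13 over "a".."z".
alphabet_size = 26
max_length = 13
useful_searches = 8348

# 26^1 + 26^2 + ... + 26^13 (Ruby integers don't overflow).
total = (1..max_length).sum { |len| alphabet_size**len }

# Fraction of the space that returns results, as a percentage.
percent_useful = useful_searches.to_f / total * 100

puts total           # 2580398988131886038
puts percent_useful  # roughly 3.2e-13
```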
Anyway, I wrote a new program that didn't suck, and you can read a description of how it works after I go over some of the results. Like I said before, there were 8348 search strings that produced any results at all. Of those, 5507 return exactly one result. Of course, a lot of those are redundant. For example, if "icho" were to only hit "Nicholas", then so would "Nicho", "Nichol", "Nichola", "ichola", and so on. If you then reduce the redundant results down to just the shortest one, we'll get a list of what we'll call "minimal searches": searches (1) that return only one result and (2) for which there are no shorter searches that return the same result. Condition (2) is worded a little bit awkwardly because there can be more than one minimal search of a certain length. For example, it could be that "ich" and "cho" are both minimal searches for "Nicholas", if they both only match "Nicholas" and there are no two-letter strings that match only "Nicholas".
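To make the definition concrete, here's a small sketch of one way to compute minimal searches in Ruby. This is not my actual script: the helper names and the three sample names are made up, and it assumes every list entry is a single word matched case-insensitively.

```ruby
# All distinct contiguous substrings of a (downcased) name.
def substrings(name)
  chars = name.downcase
  (0...chars.length).flat_map do |start|
    (1..chars.length - start).map { |len| chars[start, len] }
  end.uniq
end

# Map each search string to the list of distinct names it matches.
def build_index(names)
  index = Hash.new { |hash, key| hash[key] = [] }
  names.uniq.each do |name|
    substrings(name).each { |search| index[search] << name }
  end
  index
end

# For each name, the shortest searches that match that name and no other.
def minimal_searches(names)
  unique_hits = build_index(names).select { |_search, hits| hits.size == 1 }
  by_name = unique_hits.keys.group_by { |search| unique_hits[search].first }
  by_name.transform_values do |searches|
    shortest = searches.map(&:length).min
    searches.select { |s| s.length == shortest }.sort
  end
end

names = ["Nicholas", "Nick", "Nicole"] # hypothetical sample list
p minimal_searches(names)
```

In this tiny list every name has a one-letter minimal search ("k" only appears in "Nick", for instance). A name that never shows up as the sole match for any search simply won't appear in the result.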
With that concept in mind, we can answer a few more questions. For example, how many names have minimal searches at all? Shouldn't all of them? Interestingly, only 481 names have minimal searches. Part of this oddity is a result of the way I've implemented the searching. One name that fails to have a minimal search is mine: there are other people in my friends list named Nick and others with the last name Starr. The way I've implemented this so far, you can't have multiple words (separated by spaces) in a search, so you couldn't search something like "ck rr" to try to get "Nick Starr". With that capability, there wouldn't be any names without minimal searches, unless there were literal duplicates in the list.
Well, that's about all I can think of doing with a list of my friends' names. If only Facebook had provided more info (like which of my friends were friends with each other), there would be all kinds of other things I could do. As it is, my next goal is to find a good HTML parsing library for Ruby and write some code to start messing with the other data, hopefully in a more organized way than my friends list code.
Appendix A: How The Better Algorithm Works
The gist of how I made my code better is to restrict it to only look at valid search terms, by working from the names themselves rather than looking at all the possible search strings. For every name I go through a process of generating all the search strings that will match it. For example, for Nick, I would generate the following:
n, i, c, k
ni, ic, ck
nic, ick
nick
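Here's a minimal sketch of that generation step in Ruby. The method name search_strings is made up for this example, and it assumes names are lowercased before matching.

```ruby
# Generate every contiguous run of letters in a name, shortest first.
def search_strings(name)
  letters = name.downcase.chars
  (1..letters.length).flat_map do |len|
    # each_cons(len) slides a window of `len` letters across the name.
    letters.each_cons(len).map(&:join)
  end
end

p search_strings("Nick")
# => ["n", "i", "c", "k", "ni", "ic", "ck", "nic", "ick", "nick"]
```

A name with n letters yields n(n+1)/2 search strings (before removing any repeats), which is tiny compared to the space of all possible strings.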
For each one of those search strings, I would run it against the whole list of names, seeing which names are matched by "n", which ones are matched by "i", and so on. Fortunately, this is a textbook example for the use of a simple optimization technique known as memoization. If you start running this method for a few sample names, you'll see that there's a lot of duplication. For example, in my dataset the string "nic" will match "Nick", "Nicholas", "Nichole", "Nicole", and a few last names. This means that every time I run the process above on any of those names (most of which are repeated a few times throughout the list), my code would have to search the whole list of names to see what matches "nic", even though it's already done that before. The key to memoization is to save the results of a function that gets called repeatedly with the same input. Sometimes you have to be careful about which results to save, but this computation is small enough that I just saved all of them. Then you end up with something like this (in pseudocode):
Did we already search for this string?
    If yes
        Use the stored value instead of re-searching.
    If no
        Perform the search and then store the result for the future.
It may seem like a minor change, but memoization can often have a huge impact on a function's efficiency. In this case, it cut off about two thirds of my code's running time. Of course, memoization only works if the function being considered is what's called referentially transparent, which is a fancy way of saying that given the same input, it'll always return the same output no matter how many times you call it. Trivial examples of functions that are not referentially transparent would be one that returns a random number, or one that asks the user for input. In this case, if I were to change the underlying list of names that I'm working with, my memoized values would no longer be valid, because my searching function would no longer be referentially transparent.
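As a rough illustration (not the code from my script), here's one common way to memoize a lookup like this in Ruby, using a Hash with a default block. The sample names and the scans counter are made up so you can see that the second lookup never touches the list.

```ruby
names = ["Nick", "Nicholas", "Nicole", "Sam"] # hypothetical sample list
scans = 0 # how many times we actually walk the whole list

# The block only runs when a key is missing, so each distinct search
# string scans the list once; later lookups hit the stored value.
matches = Hash.new do |cache, search|
  scans += 1
  cache[search] = names.select { |name| name.downcase.include?(search) }
end

p matches["nic"] # scans the list
p matches["nic"] # served from the cache
puts scans
```

Note that the cache is only valid for this particular names list. If the list changes, the stored results have to be thrown away, which is exactly the referential transparency caveat above.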
Sunday, October 17, 2010
WARNING: Ke$ha post
If terribly trashy yet catchy pop music isn't your thing, consider yourself forewarned. For the rest of us, Ke$ha has released a new single! It's called We R Who We R (yeah, I cringed at the "R"s too), and it's from her EP Cannibal that's set to be released in late November.
Wolfram Alpha... Scrabble expert?
Yes, Wolfram Alpha, the "computational knowledge engine" better known for things like helping you cheat on your math homework, having incredibly detailed demographic data, and giving snarky answers to philosophical questions, now has a bunch of data on Scrabble.
The Friends List: Fun With Facebook, Part 2
(Note: I'll be uploading the source of all the scripts I'm writing for this project to pastebin, but I make no guarantees on the quality of the code. I'm just writing quick-and-dirty Ruby to get some results without a ton of regard for efficiency or readability. That'll come later.)
Source for this post
As I mentioned in my intro to this series, the friends list is the simplest info in this whole package - it's quite literally just a list of names in plain text (592 total for me). The first thing I did was just a basic frequency count, which came out basically how you'd expect.
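A frequency count like that only takes a few lines of Ruby. Here's a sketch, assuming the count is over first names and that each entry is "First Last"; the method name and the sample names are invented for the example.

```ruby
# Count how often each first name appears, most common first
# (ties broken alphabetically).
def first_name_counts(full_names)
  counts = Hash.new(0)
  full_names.each { |full| counts[full.split.first] += 1 }
  counts.sort_by { |name, count| [-count, name] }
end

# Hypothetical sample entries, not my actual friends list.
friends = ["Nick Doe", "Nicole Roe", "Nick Poe", "Sam Moe"]

first_name_counts(friends).each do |name, count|
  puts "#{name}: #{count}"
end
```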
Saturday, October 16, 2010
Fun with Facebook, Part 1
(This is a continuation of this post)
So after a few hours, Facebook finally finished preparing the zip file with all my info and sent me a download link. They had some interesting security features to make sure that no one else would be able to download it. Not only did I have to re-enter my password, but I had to identify a bunch of my friends. They picked three pictures of each friend at random, and asked me to pick who it was from a list of names. Very cool method, and makes it unlikely that someone will get tricked into giving this info to someone else by a simple phishing scam or something.
Friday, October 15, 2010
Downloading Your Facebook Account
Facebook recently rolled out a new feature: you can now download a zip archive of all the info associated with your account. Here's the full contents of the archive, according to Facebook's help pages:
Jukebox the Ghost - Good Day
(Crap, I missed Thursday! Here's a song while I think of something else for today to make up for yesterday)
I kind of discovered Jukebox the Ghost by accident. I've had an album of theirs (Let Live and Let Ghosts) for ages now, but haven't really listened to it much at all. A few weeks ago, I gave it another chance, and now I'm in love with it. Here's a video of them playing the opener from the album:
I'll admit to liking them partly because they feature the piano so much. The guy playing is really quite good, and his style reminds me a bit of Ben Folds.
Wednesday, October 13, 2010
Sesame Street + Old Spice = ???
I'm guessing many of you have seen this Old Spice ad from the Super Bowl. It's been parodied countless times, but I think this is my favorite one yet:
Tuesday, October 12, 2010
Dorm Room Engineering
Too bad W&M doesn't have an Engineering major, that might have been a good fit for me after all. Keep reading to see pictures of some of the useful things I've accomplished using totally unrelated items in my dorm room.
Cab Calloway and the Nicholas Brothers - Jumpin Jive
A great big band song and some mindblowing tapdancing.
Monday, October 11, 2010
What *isn't* Google doing?
Google. It started out as a research project written by two guys at Stanford, and now it's officially a word in the Oxford English Dictionary. Since then, it's branched out to all kinds of other services, some of which you've probably used: Google Maps, Gmail, Google Docs, etc. However, they're also branching out into all kinds of other projects, and at this point it looks like it's only a matter of time before they take over the world.
Sunday, October 10, 2010
Pink Floyd + The Bee Gees = Stayin' Alive In The Wall
Here's a great mashup of the Bee Gees' "Stayin' Alive" and Pink Floyd's "Another Brick in the Wall (Part 2)" (thanks, Kremer).
Saturday, October 9, 2010
Fall Break Adventures
So this weekend is Fall Break at W&M, which means no classes on Monday and Tuesday. Not much even compared to Thanksgiving Break, but hey, I'm not complaining! I decided to stay at school and relax, but my roommate went home with his family this morning. I woke up after they had left, and went out to run some errands.
Like a phoenix from the ashes
So, this blog kind of died out over the summer (because I got tired of trying to update it on tour), and I never really resurrected it when I got back to school. But that's all about to change! I'm going to give this whole "updating once a day" thing a shot again.