SmartFunnyCool

Writing about code

Getting a Date with Wolfram|Alpha (Part I)


This is the first post in a series about using the Wolfram|Alpha API to parse an English phrase and return a date or range of dates for a historical event, figure, or piece of popular culture. In this post I discuss the basic functionality of Wolfram|Alpha and how the Ruby gem 'wolfram-alpha' encapsulates much of it. In later posts I will show improvements to this gem that produce search results that are easier to categorize.

Wolfram|Alpha

Wolfram|Alpha is the website that would have done your high school homework for you. Most simply, you go to www.wolframalpha.com and enter a question about math, science, social studies, English, or anything else that would have tortured your 15-year-old self, and Stephen Wolfram (pictured above, left) and his associates (pictured above, right) attempt to answer it. In his own words:

We aim to collect and curate all objective data; implement every known model, method, and algorithm; and make it possible to compute whatever can be computed about anything. Our goal is to build on the achievements of science and other systematizations of knowledge to provide a single source that can be relied on by everyone for definitive answers to factual queries.

And it works too. A simple query asking about the Lorenz attractor yields the following:

You could also ask about food-like products:

But my main concern is using Wolfram|Alpha to find calendar dates relating to historical figures and historical events.

Wolfram|Alpha API Output

Wolfram|Alpha organizes its output into regions called pods, and within each pod are one or more subpods. For the result above, the pods "Input Interpretation", "Basic Information", "Album Art", and "Tracklist" are displayed. Each of these pods has only one subpod, which displays the relevant information for the pod's topic. In addition to pods, Wolfram|Alpha interprets queries based on assumptions that can be manipulated to get better results (more on this in a later post). The Wolfram|Alpha API gives developers access to the pods both as image files and as plaintext. So the first two pods from the API call for the above result look like the following:

<queryresult success="true" error="false" numpods="7" datatypes="Country,MusicAlbum,MusicAlbumRelease,MusicArtistCredit,WikipediaStats" timedout="" timedoutpods="Word cloud" timing="9.481" parsetiming="0.204" parsetimedout="false" recalculate="" id="MSPa25621dgi4hbdd6igf9ch00001bgaa08i9ef400e3" host="http://www4b.wolframalpha.com" server="24" related="http://www4b.wolframalpha.com/api/v2/relatedQueries.jsp?id=MSPa25631dgi4hbdd6igf9ch00001be2g1eb28h40h55&s=24" version="2.6">  
<pod title="Input interpretation" scanner="Identity" id="Input" position="100" error="false" numsubpods="1">  
<subpod title="">  
<img src="http://www4b.wolframalpha.com/Calculate/MSP/MSP25641dgi4hbdd6igf9ch00003ag49425d2ai0f57?MSPStoreType=image/gif&s=24" alt="Appetite for Destruction (music album)" title="Appetite for Destruction (music album)" width="260" height="18"/>  
<plaintext>Appetite for Destruction (music album)</plaintext>  
</subpod>  
</pod>  
<pod title="Basic information" scanner="Data" id="BasicInformation:MusicAlbumData" position="200" error="false" numsubpods="1">  
<subpod title="">  
<img src="http://www4b.wolframalpha.com/Calculate/MSP/MSP25651dgi4hbdd6igf9ch0000231b9i7d9i41fgfc?MSPStoreType=image/gif&s=24" alt="name | Appetite for Destruction artist | Guns N' Roses release date | July 7, 1987 runtime | 53 minutes 48.7 seconds" title="name | Appetite for Destruction artist | Guns N' Roses release date | July 7, 1987 runtime | 53 minutes 48.7 seconds" width="282" height="132"/>  
<plaintext>  
name | Appetite for Destruction artist | Guns N' Roses release date | July 7, 1987 runtime | 53 minutes 48.7 seconds  
</plaintext>  
</subpod>  
<infos count="1">  
<info>  
<link url="http://www.amazon.com/gp/product/B000000IVW" text="Buy from Amazon"/>  
</info>  
</infos>  
</pod>  
...
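
To make the pod and subpod structure concrete, here is a minimal sketch that walks a response like the one above using REXML, which ships with Ruby. The SAMPLE_XML string is a hypothetical, trimmed-down stand-in for a real response, and pods_to_hash is a helper name invented for this illustration; neither comes from the Wolfram|Alpha API or the gem.

```ruby
require 'rexml/document'

# Hypothetical, trimmed-down response modeled on the output above.
# A real response carries many more attributes and pods.
SAMPLE_XML = <<~XML
  <queryresult success="true" numpods="2">
    <pod title="Input interpretation" id="Input" numsubpods="1">
      <subpod title="">
        <plaintext>Appetite for Destruction (music album)</plaintext>
      </subpod>
    </pod>
    <pod title="Basic information" id="BasicInformation:MusicAlbumData" numsubpods="1">
      <subpod title="">
        <plaintext>release date | July 7, 1987</plaintext>
      </subpod>
    </pod>
  </queryresult>
XML

# Map each pod title to the plaintext of its subpods.
# (pods_to_hash is an illustrative helper, not part of any library.)
def pods_to_hash(xml)
  doc = REXML::Document.new(xml)
  pods = {}
  doc.elements.each('queryresult/pod') do |pod|
    texts = []
    pod.elements.each('subpod/plaintext') { |node| texts << node.text.to_s.strip }
    pods[pod.attributes['title']] = texts
  end
  pods
end

pods = pods_to_hash(SAMPLE_XML)
pods.each { |title, texts| puts "#{title}: #{texts.join(' / ')}" }
```

Keying the hash by pod title keeps the sketch short, but titles vary between query types, so a real program would probably want to match on the pod id as well.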

Wolfram-Alpha's Ruby Gem

Luckily, the 'wolfram-alpha' gem provides a Ruby data structure that mirrors the logic of pods and subpods. Given the heterogeneity of Wolfram|Alpha output, it does not do much more than provide Ruby methods for querying and traversing the pods and subpods. But for a simple request it is fairly easy to use.

Below is a short program I wrote that queries Wolfram|Alpha and returns an array of all pieces of date information in the basic information pod.

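require 'wolfram-alpha'

# Looks up date entries in the "Basic information" pod of a Wolfram|Alpha result.
class GetDate

  # A method parameter cannot be a constant in Ruby, so the key is lowercase here.
  def initialize(secret_key)
    options = { "format" => "plaintext" }
    @client = WolframAlpha::Client.new(secret_key, options)
  end

  # Returns every line of the basic information pod that mentions "date".
  def get(event)
    response = @client.query(event)
    basic_info = response.pods.find { |pod| pod.title.downcase.include?("basic") }
    return [] if basic_info.nil? # no basic information pod for this query

    basic_info.subpods.flat_map do |subpod|
      subpod.plaintext.split("\n").select { |info| info.include?("date") }
    end
  end

end

date_finder = GetDate.new("SECRET_KEY")
puts date_finder.get("appetite for destruction").inspect

# $ ruby getdate.rb
# => ["release date | July 7, 1987"]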

The client establishes a connection to Wolfram|Alpha using my unique developer ID. Querying it returns a response object that contains a number of pods. I then find the "Basic Information" pod and return all entries that include the word "date".
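
Since each entry comes back as a "label | value" string, it can be handy to split the pod's plaintext into a Hash before filtering. The following is only a sketch: parse_fields and date_fields are hypothetical helper names, and the sample string assumes one "label | value" pair per line, as in the output above.

```ruby
# Split "label | value" lines into a Hash of label => value.
# Lines without a "|" separator are skipped.
def parse_fields(plaintext)
  plaintext.each_line.with_object({}) do |line, fields|
    label, value = line.split('|', 2).map(&:strip)
    fields[label] = value unless value.nil? || label.empty?
  end
end

# Keep only the entries whose label mentions "date".
def date_fields(fields)
  fields.select { |label, _value| label.downcase.include?('date') }
end

# Sample text assumed to match the basic information pod shown earlier.
sample = "name | Appetite for Destruction\n" \
         "artist | Guns N' Roses\n" \
         "release date | July 7, 1987"

fields = parse_fields(sample)
puts date_fields(fields)["release date"] # prints: July 7, 1987
```

Filtering on the label rather than the whole line avoids matching a value that merely happens to contain the word "date".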

In a later post I will attempt to encapsulate these dates in Ruby Date objects. While the Date.parse() method does a great job of parsing strings such as "July 7, 1987", not every response is structured that way. Often Wolfram|Alpha returns a range of dates in a single string or as two strings. Predicting the output will depend on better understanding Wolfram|Alpha's assumptions, and this will require making some changes to the Ruby gem so that it returns and manipulates assumption information.
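
As a rough preview of that step, here is a sketch that turns a date string into Date objects with Date.parse. The " to " separator used to split a range is purely an assumption for illustration, as is the range string in the usage lines; I have not confirmed how Wolfram|Alpha actually formats ranges. The method name parse_dates is likewise hypothetical.

```ruby
require 'date'

# Parse a single date, or a range written as "<start> to <end>",
# into an array of Date objects. Returns [] if any part fails to parse.
# NOTE: the " to " separator is an assumed format, not confirmed API output.
def parse_dates(text)
  text.split(/\s+to\s+/).map { |part| Date.parse(part) }
rescue ArgumentError # Date::Error is a subclass of ArgumentError
  []
end

puts parse_dates("July 7, 1987").first.year # prints: 1987

# A made-up range string, used only to exercise the split.
range = parse_dates("July 7, 1987 to July 21, 1987")
puts (range.last - range.first).to_i # prints: 14
```

Returning an empty array on failure keeps the caller simple, at the cost of hiding which part of the string could not be parsed.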

Sources

  • The wolfram-alpha ruby gem is bare-bones but easy to get started with.
  • The Wolfram|Alpha API documentation is comprehensive and very clearly written.