{"id":1345,"date":"2009-05-22T11:29:50","date_gmt":"2009-05-22T10:29:50","guid":{"rendered":"http:\/\/www.walkingrandomly.com\/?p=1345"},"modified":"2009-05-22T11:29:50","modified_gmt":"2009-05-22T10:29:50","slug":"getting-useful-data-out-of-wolfram-alpha-can-be-difficult","status":"publish","type":"post","link":"https:\/\/walkingrandomly.com\/?p=1345","title":{"rendered":"Getting useful data out of Wolfram Alpha can be difficult"},"content":{"rendered":"<p>Up until now I have been using Wolfram Alpha as the ultimate geek toy and have been truly delighted with it, but I thought it was high time I considered how one might use it more seriously.\u00a0 So I set myself a task.\u00a0 Nothing too complicated, you understand; after all I am still finding my feet with this new system, but something that might plausibly come up in the real world.\u00a0 The task I set myself was:<\/p>\n<p><strong>Obtain the actual data points for the Gross Domestic Product (GDP) of the UK from 1970 to 1980 inclusive.\u00a0 To allow me to import this data into pretty much every analysis program on the planet I&#8217;ll want it in a CSV file of the form<\/strong><\/p>\n<pre>1970,GDP of UK for 1970\r\n1971,GDP of UK for 1971\r\netc<\/pre>\n<p>Should be easy, huh?\u00a0 Wolfram Alpha knows all about the GDP of the UK &#8211; if I Wolf <a href=\"http:\/\/www.wolframalpha.com\/input\/?i=gdp+uk\">GDP UK<\/a> then I get the following output (among other things).<\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" class=\"aligncenter\" src=\"\/images\/walpha\/big\/walpha_GDPuk.png\" alt=\"Graph of UK GDP from Wolfram Alpha\" \/><\/p>\n<p>Fabulous! The data is clearly in there but how do I get it out in the form I want? 
Let&#8217;s try the hopeful <a href=\"http:\/\/www.wolframalpha.com\/input\/?i=UK+GDP+from+1970+to+1980\">UK GDP from 1970 to 1980<\/a>.\u00a0 Alas, I got the now familiar &#8216;<em>Wolfram<span>|<\/span>Alpha isn&#8217;t sure what to do with your input.&#8217;<\/em> Moving on, I tried <a href=\"http:\/\/www.wolframalpha.com\/input\/?i=UK+GDP+1970+to+1980\">UK GDP 1970 to 1980<\/a> and <a href=\"http:\/\/www.wolframalpha.com\/input\/?i=UK+GDP+1970-1980\">UK GDP 1970-1980<\/a> but they didn&#8217;t work either.<\/p>\n<p>I can get at a single datum easily enough.\u00a0 <a href=\"http:\/\/www.wolframalpha.com\/input\/?i=UK+GDP+1970\">UK GDP 1970<\/a> gives me $123.7 billion, for example, but how do I get it to give me a list?\u00a0 Further experimentation showed me that I can get the GDP for any two years if I Wolf for something like <a href=\"http:\/\/www98.wolframalpha.com\/input\/?i=(UK+GDP+1970)+(UK+GDP+1971)\">(UK GDP 1970) (UK GDP 1971)<\/a>.<\/p>\n<p>By now I felt I was getting somewhere.  While playing with Wolfram Alpha (and reading the community forum) I&#8217;ve discovered that it will sometimes parse Mathematica code as well as plain English.  &#8216;<em>What I need<\/em>&#8217;, I thought, &#8216;<em>is a piece of Mathematica code that would generate the query for me<\/em>&#8217;.  So I tried<\/p>\n<p><a href=\"http:\/\/www.wolframalpha.com\/input\/?i=Table[%22(GDP+UK+%22+%3C%3E+ToString[x]+%3C%3E+%22)%22%2C+{x%2C+1970%2C+1980}]\">Table[&quot;(GDP UK &quot; &lt;&gt; ToString[x] &lt;&gt; &quot;)&quot;, {x, 1970, 1980}]<\/a><\/p>\n<p>but that didn&#8217;t work.  I shouldn&#8217;t be surprised, though, because Table turns out to be one of the Mathematica functions that Wolfram Alpha doesn&#8217;t parse.  
Ho hum&#8230;<\/p>\n<p>I tried a LOT of different inputs but the practical upshot is that the only one that worked was <a href=\"http:\/\/www.wolframalpha.com\/input\/?i=(UK+GDP+1970)+(UK+GDP+1971)+(UK+GDP+1972)+(UK+GDP+1973)+(UK+GDP+1974)+(UK+GDP+1975)+(UK+GDP+1976)+(UK+GDP+1977)+(UK+GDP+1978)+(UK+GDP+1979)+(UK+GDP+1980)\">(UK GDP 1970) (UK GDP 1971) (UK GDP 1972) (UK GDP 1973) (UK GDP 1974) (UK GDP 1975) (UK GDP 1976) (UK GDP 1977) (UK GDP 1978) (UK GDP 1979) (UK GDP 1980)<\/a>.\u00a0 Lord help me if I wanted three times as many data points.<\/p>\n<p>For the record, I can get exactly what I wanted in Mathematica 7 with the following two lines of code, and I worked out how to do it with a moment&#8217;s thought.\u00a0 Wolfram Alpha needs to be this easy!<\/p>\n<pre>data = Table[{x, CountryData[\"UK\", {\"GDP\", x}]}, {x, 1970, 1980}];\r\nExport[\"GDP.csv\", data]<\/pre>\n<p>So, after some blood, sweat and tears I had some actual numerical data, but how could I export it to something useful?\u00a0 Wolfram Alpha returns results as images by default.<\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" class=\"aligncenter\" src=\"\/images\/walpha\/big\/walpha_UKGDP_results.png\" alt=\"UK GDP results from Wolfram Alpha\" \/><\/p>\n<p>These are not particularly useful if you want to do your own analysis.\u00a0 I can also get the results as copyable plaintext, and for this data set it looks like this:<\/p>\n<pre>$123.7 billion per year  (US dollars per year)  |  $139.9 billion per year  (US dollars per year)\r\n|  $160.8 billion per year  (US dollars per year)|  $181.5 billion per year  (US dollars per year)\r\n|  $196 billion per year  (US dollars per year)  |  $234.4 billion per year  (US dollars per year)\r\n|  $225.2 billion per year  (US dollars per year)  |  $254.4 billion per year  (US dollars per year)\r\n|  $322.3 billion per year  (US dollars per year)  | $418.9 billion per year  (US dollars per year)  |\r\n  $537.2 billion per year  (US dollars per 
year)<\/pre>\n<p>Hmmm. That&#8217;s going to need some pre-processing before I can import it into Excel &#8211; a job for a student or a Python script, I think.<\/p>\n<p>Now on to the Source information.  Wolfram Alpha listed its primary sources as &#8216;Wolfram Alpha Curated data 2009&#8217; and &#8216;Wolfram Mathematica CountryData&#8217; with a shed load of Secondary sources such as &#8216;The US CIA WorldFactbook&#8217;.  I have to say that I was a little surprised at this &#8211; how is Wolfram Alpha the Primary source of this data set?  They must have got it from somewhere and THAT somewhere would be the primary source (or closer to it at least) IMHO.<\/p>\n<p>In all honesty, I feel that listing itself as the primary source for data such as this is a bit like a student writing an essay and under &#8216;<strong>references<\/strong>&#8217; simply putting &#8216;<strong>My head<\/strong>&#8217;.<\/p>\n<p>Don&#8217;t get me wrong, I am starting to love Wolfram Alpha and think it&#8217;s got amazing potential but when you love someone you always want to see them do better for themselves.\u00a0 In this particular area I think that Wolfram needs to address the following:<\/p>\n<ul>\n<li>Make it easier to get lists of data out of WA.  
Being able to parse Table[] might be a good start.<\/li>\n<li>Allow export of tabular data in popular formats such as CSV and Excel.<\/li>\n<li>Work on the sources information a little.\u00a0 Wolfram Alpha didn&#8217;t actually generate this GDP data &#8211; they must have got it from somewhere and that should be listed as the primary source.<\/li>\n<\/ul>\n<p>Wolfram Alpha is a constantly moving target and it is quite possible that all of these issues will be addressed in no time (if Wolfram agrees that they are issues, of course), so feel free to point out if any of the inputs I have linked to give different results from those stated here.\u00a0 I am also aware that I don&#8217;t know everything about this system, so if I am being an idiot then feel free to point out how I should have phrased my query.\u00a0 Finally, if any new functionality comes online that makes all of this trivial then I would love to know.<\/p>\n<p>Comments are, as always, welcome.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Up until now I have been using Wolfram Alpha as the ultimate geek toy and have been truly delighted with it, but I thought it was high time I considered how one might use it more seriously.\u00a0 So I set myself a task.\u00a0 Nothing too complicated, you understand; after all I am 
[&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[8,33],"tags":[],"class_list":["post-1345","post","type-post","status-publish","format-standard","hentry","category-mathematica","category-wolfram-alpha"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p3swhs-lH","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts\/1345","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1345"}],"version-history":[{"count":15,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts\/1345\/revisions"}],"predecessor-version":[{"id":1360,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts\/1345\/revisions\/1360"}],"wp:attachment":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1345"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1345"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=
1345"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}