Yesterday was an important date for all Spaniards: polling day, the first elections since the global crisis was acknowledged. Two days ago we had the so-called Reflection Day, named that way to invite all voters to think about their decision. However, the common feeling these months has been the indignation of the Spanish people across the country and, why not, across the world. Well, as a Spaniard who lives in Canada, what I did on my Reflection Day was a quick and dirty word frequency and sentiment analysis of the election manifestos of some of the most “important” parties in Spain.
To do it, I followed a series of steps, mechanically, and applied them to the official manifestos. Of course, some of them, for example the EAJ-PNV’s manifesto, weren’t in a text format but in image format. Those files weren’t processed. The parties analyzed are IU, PP, PSOE, Esquerra, UPyD, CC, EQUO and GEROABAI.
Once I had downloaded all the PDF files, some of them really heavy, I used the pandoc tool to extract just the text. With that done, I wrote a little Python script to split the text into single sentences, join sentences broken across two lines, and clean up several things, like page numbers or extra dots. The same script then connects to the Sentiment API from ViralHeat to get the positive or negative feeling of every sentence in the manifesto. With the results stored in a file in JSON format, one line per sentence, a second Python script extracts just the numbers as CSV, so they can be loaded into a spreadsheet to calculate some statistics.
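For the curious, the two scripts could look roughly like the sketch below. It is not the code I actually ran, just a minimal reconstruction of the steps described above. The call to the sentiment API is left out on purpose, and the JSON field names `mood` and `prob` are an assumption about what each stored line contains.

```python
import csv
import json
import re


def clean_and_split(raw_text):
    """Turn raw extracted text into a list of cleaned sentences."""
    kept = []
    for line in raw_text.splitlines():
        line = line.strip()
        # Skip blank lines and lines that are only a page number.
        if not line or re.fullmatch(r"\d+", line):
            continue
        kept.append(line)
    # Joining with spaces reunites sentences broken across two lines.
    text = " ".join(kept)
    # Collapse runs of extra dots ("....") into a single full stop.
    text = re.sub(r"\.{2,}", ".", text)
    text = re.sub(r"\s+", " ", text)
    # Naive split: after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [s.strip() for s in sentences if s.strip()]


def json_lines_to_csv(json_lines, out_file):
    """Write one CSV row per JSON line, keeping only the sentiment values.

    Assumes (hypothetically) that every line is a JSON object with a
    'mood' label and a 'prob' score.
    """
    writer = csv.writer(out_file)
    writer.writerow(["mood", "prob"])
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        writer.writerow([record["mood"], record["prob"]])
```

The sentence splitter is deliberately simple: it will break on abbreviations such as “Sr.”, which is acceptable for a quick and dirty count but not for anything more serious.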
The last part of the analysis was to create a visualization of the data. For this, I chose the Nightingale’s Rose from the Protovis visualization toolkit, and the Wordle tool to create tag clouds. The result of the first one can be seen below.
Every slice in the diagram has two areas. The one in blue represents the total number of sentences with a positive sentiment. The one in red represents the total number of sentences with a negative sentiment. The length of the manifestos ranges from the 2,250 sentences of Esquerra’s to the much shorter 623 sentences of UPyD’s. In relative terms, the results are as follows.
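The relative figures are nothing more than each sentiment’s share of the total number of sentences in a manifesto. I did this step in a spreadsheet, but a minimal Python sketch of the same calculation could be the following. The function name `sentiment_shares` is made up for illustration, and the labels in the usage example are toy values, not real results.

```python
from collections import Counter


def sentiment_shares(moods):
    """Return the percentage of sentences carrying each sentiment label."""
    counts = Counter(moods)
    total = sum(counts.values())
    if total == 0:
        # An empty manifesto has no shares to report.
        return {}
    return {mood: 100.0 * n / total for mood, n in counts.items()}


# Toy input: three positive sentences and one negative.
example = sentiment_shares(["positive", "positive", "positive", "negative"])
```

Working in percentages is what makes a manifesto of 2,250 sentences comparable with one of 623.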
It seems that UPyD is the most realistic party, because its manifesto has a higher percentage of negative sentences than the rest (~16%), while still keeping a good number of positive sentences. On the other side, CC and PP have the most optimistic manifestos, with more than 95% positive sentences. But what kind of words are used most in their respective manifestos? Let’s see…
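The clouds were made with Wordle, which does the counting by itself. A rough equivalent of the underlying word frequency count, in case you want the raw numbers, could be sketched like this. The stop-word list is a tiny illustrative one and the function name `top_words` is hypothetical; nothing here is what Wordle does internally.

```python
import re
from collections import Counter

# Tiny, illustrative Spanish stop-word list; a real analysis needs a fuller one.
STOPWORDS = {
    "de", "la", "el", "en", "y", "a", "los", "las", "del", "que",
    "un", "una", "para", "con", "por", "se",
}


def top_words(text, n=10):
    """Return the n most frequent words, ignoring stop words and short tokens."""
    # [^\W\d_]+ matches runs of letters, accented ones included.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return counts.most_common(n)
```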
This one is the cloud of the party currently in the Government. Its manifesto seems to center on the words social, employment (“empleo”), system (“sistema”), politics (“política”), economy (“economía”), equality (“igualdad”) and companies (“empresas”); precisely the topics in which they have notably failed.
The PP is the main opposition party and allegedly more right-wing (actually both have shown the same social policies in the past). Its manifesto is strongly focused on the word change (“cambio”), followed by employment (“empleo”), society (“sociedad”), stability (“estabilidad”), reforms (“reformas”), better (“mejor”), European (“europea”), welfare (“bienestar”), and the future tense of motivate, stimulate or boost (“impulsaremos”). Of course it is a really positive speech. Who wouldn’t vote for them with that kind of happiness and improvements? It won’t be me…
On the other hand, the historically more left-wing party focuses on highlighting, again, the words left-wing (“izquierda”), proposals (“propuestas”), rights (“derecho”), united (“unida”, just because the name is something like United Left-Wing Party), social in many forms (“social”), elections (“electoral”), public in a bunch of other forms (“público”, “pública” and so on) and services (“servicios”). In my opinion, not a very strong manifesto and maybe a little bit fainthearted.
It looks like our most realistic party has no prominent word. Instead, they focus on communities (“comunidades”), development (“desarrollo”), regions (“autónomas”) and administration (“administración”). Perhaps it’s the most heterogeneous manifesto that I have analyzed.
Esquerra is a Catalan party and I couldn’t find the manifesto in Spanish. Anyway, they seem to be centered on state (“estat”), people (“persones”), the name of their region, Catalonia (“Catalunya”), social (“social”), action (“acció”) and politics (“política”).
Equo is a newly created party, founded by the ex-director of Greenpeace Spain, with a strong focus on the environment and global warming. That’s why we can find words like “salud” (health), “sostenible” (sustainable) or “desarrollo” (development).
In these last two clouds we can see the name of the party and, more importantly, the name of the corresponding autonomous regions: Canarias and Navarra. The rest of the words are barely used. Maybe they are trying to win voters in their own regions, because the whole manifesto revolves around the names of the regions.
Sadly, the worst has come. And what is it about? It’s not about having a hard right-wing party for the next 4 years. It’s about granting one party the power to rule alone in the Government, with an absolute majority that is absurd and, most of the time, counterproductive.