Monthly Archives: November 2012

What if I decide to teach a MOOC? Well, then I should learn some Python :)

Well, I think that today everybody knows what a MOOC is. MOOC stands for Massive Open Online Course. You may have heard the term first from Stanford University, and then from Udacity, Coursera, edX, or TED-Ed. There is a lot of hype about the concept and the idea of MOOCs, although it is not as new as we might think. The Open Learning Initiative could be one of the first to explore this trend, or more recently, P2P University and Khan Academy. However, when you decide to teach your content following the MOOC model, there are some steps to overcome. The first question is whether you need your own platform or can just use one of those available. If the answer to this question is something like “what are you talking about”, then you can put your content on sites like Udemy, CourseSites by Blackboard or iTunes U, and forget about systems administration, user registration, machine requirements, bandwidth, etc. But you will be tied to a company and its constraints. Or, if you are part of a bigger institution, you can beg your boss to join one of the big consortia mentioned above. Let me tell you something: this is not going to happen quickly (or at all, the wheels of bureaucracy turn slowly), so better to try a different approach. On the other hand, if you have a passable server with acceptable bandwidth, some tech guys with free time (which is an oxymoron), and a lot of energy and passion, you can also set up your own infrastructure. If this is your case, what are your options? Well, just a few that I will enumerate.

  • OpenMOOC aims to be an open source platform (Apache license 2.0) that implements a fully open MOOC solution. It is fully video-centered, following in the steps of the first experiment, Stanford’s AI-Class. It is a new approach, but it makes it harder to add traditional questions not based on videos, or even the submission of essays. It is prepared to be used with an IdP in order to have identity federation for big sites. It can automatically process YouTube videos and extract the last frame as the question if required. Because we don’t particularly need the federation, we removed that feature and added some more in our own fork, just to try the solution. It can also connect to AskBot for a forum-like space for questions and answers. Successfully deployed in UNED COMA.
  • Class2Go, easier to install and get running but kind of complex to manage. It integrates very well with services such as Amazon SES (which we added to our OpenMOOC fork), Piazza, the Khan Academy HTML-based exercise framework, and Amazon AWS. Used by Stanford.
  • Course Builder, pretty beautiful but hard to deploy or add content to. Used by Google for some of its free courses.
  • Lernanta is without doubt the best documented and the easiest to install. It is the underlying system of P2PU and it has an active, real community behind it. It has an awesome badge system, a detailed dashboard, an API, and a bunch of modules (formally, Django applications). But it doesn’t manage videos as well as the other two.

All of them are built using Python and, except Course Builder, Django as the core technology. It just so happens that here at CulturePlex Lab we use Python and Django a lot. That’s why we are currently forking everywhere and creating our own MOOC system. And that’s the magic of Open Source: we can fork OpenMOOC, take some features from Class2Go and others from Lernanta and, as long as we respect the licenses, release a new MOOC system, the CulturePlex Courses (still under heavy testing).

Next post? Some notes about what you need, in physical terms, like a camera, a monopod, a tablet, etc.


Filed under Analysis

The experience of the PyCon Canada 2012 #PyConCa

Well, so finally the date arrived and I had to go to Toronto to give a talk about Graph Databases in Python. From the very beginning of the event I could feel the energy and good vibrations of the wonderful team of organizers. From here, my humble congratulations to all of them, including the volunteers, for an awesome job that made an amazing experience real. It was my first PyCon ever. I had already heard about Python conferences and how cool they are, but I never had the opportunity to be at one. PyConCa, the first of its kind Canada-wide, gave me the chance I wanted.

The reception took place on Friday night, where I could meet some people, register as a speaker and get my badge. I must say that the badge with the shiny speaker tag made me very happy.

Saturday was the first formal day of the conference, starting relatively early (above all for those who went out the night before). The session began with a keynote by Jessica McKellar about HackerSchool. After a short break, the sessions split into three: main hall, lower hall and tutorial room. Unfortunately, I couldn’t attend any of the tutorials. So for the next slot I had to decide between the 40-minute talk about SQLAlchemy (given by its creator Michael Bayer) and the two 20-minute talks, one about MongoDB and Gene databases and one about Writing self-documenting scientific code using physical quantities. So I went to the SQLAlchemy talk for 20 minutes and then to the one about MongoDB. At the latter I met Vid Ayer, who showed herself really interested in graph databases, and she didn’t miss my talk.

After another short break, I saw a really good talk by Mike Fletcher, an independent consultant from Toronto. He gave a presentation about Profiling for Performance, in which I discovered awesome tools like Coldshot, a better alternative to hotshot, and RunSnakeRun, a really helpful graphical interface for profiling logs.

The lunch, which was included in the price of the conference, was acceptable and did the trick of fooling the stomach until dinner. The next talks were about the App Engine Python SDK; the Cloudant REST API, which has been rewritten in Python using Flask, a lightweight web framework; a Python DynamoDB mapper; and a funny presentation on everything you wanted to know about deploying web apps on Windows but were too horrified to ask. After the coffee break, Daniel Lindsley, author of tastypie and Haystack among others, gave an excellent talk about search engines.

After Daniel’s excellent presentation, I stayed in the Main Hall and listened to the talk I Wish I Knew How to Quit You: Secrets to sustainable Python communities by Elizabeth Leddy, a core developer of Plone, the once famous CMS for Python. She is, as we say in Spanish, toda una personaja, and talked about how to successfully manage a Python community. The next talk was by Mahdi Yusuf, a passionate developer from Ottawa and maintainer of the PyCoders Weekly newsletter, who explained the history of Python packaging. Right after his successful talk, Martín Alderete from the Argentinian Python users group presented the Ninja IDE, a totally awesome integrated development environment with a ton of features, open source and designed with Python in mind. And that was all for the first day.

The next morning, because my talk was in the first slot, I missed the keynote given by Michael Feathers. I was half nervous, half excited. So I went to the Lower Hall, where my talk would take place, and waited for the guy before me to finish. Steve Singer, that is his name, talked about using Python as a procedural language for PostgreSQL, which is really interesting. And finally I gave my talk about graph databases, mostly Neo4j, in Python. I presented basic concepts like what a graph is and what types of graphs exist. Then came a landscape of graph database solutions and which of them are suitable for use from Python. Finally, I showed some examples using these libraries and even a quick hint about how to deploy Neo4j on Heroku and connect to it with neo4j-rest-client. Unfortunately, at the same time, Kenneth Reitz, author of requests and working at Heroku, was giving a really interesting talk called Python for Humans.

Nevertheless, I enjoyed the experience of giving a talk, even though there weren’t a lot of people attending it. One thing I noticed was that most of the people were scientists looking to solve a problem that could be better understood using graph structures, like state machines or even biology.

And, why not, here is my talk and my slides, even though I know that the only thing worse than listening to yourself speaking in a video is, doubtless, listening to yourself speaking in a video in English. But, it is what it is :)

After that, I stayed in the same room to attend a talk on server log analysis using Pandas. Pandas aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. The guy who talked about it used IPython Notebook for the presentation, a killer feature of the interactive console IPython that was new to me. Then came some talks about big data using Disco and Inferno, and about horizontally scaling databases in Django, by guys from Chango and Wave Accounting, both Toronto-based companies. Diego Muñoz, ex-member of the CulturePlex, gave a talk about an Ember.js adapter for Django that avoids any change to your REST API if you are already using tastypie.

After a short coffee break came talks about real-time web apps by Gabriel Grant, Urwid by Ian Ward (including the awesome bpython interpreter and the speedometer tool), workloads and cloud by Chayin Kirshen, and speeding up your database by Anna Filina. But I must say that the best one of the afternoon was given by Alexandre Bourget from Montreal, aka the showman: Gevent-socketio, cross-framework real-time web live demo.

To end the day and the conference, Fernando Pérez gave a talk on science and Python as a retrospective of a (mostly) successful decade. In his slides you can clearly see IPython Notebook in use.

And that was pretty much everything. I am already looking forward to the next one. It is a really good experience: you learn a lot from great people and spend an amazing weekend surrounded by other Python coders. It is totally worth it. Let’s see if I am accepted for PyCon US. Cross your fingers!


Filed under Events

More ideas on the Virtual Cultural Laboratory

These days, after reading the article about the VCL, A Virtual Laboratory for the Study of History and Cultural Dynamics (Juan Luis Suárez and Fernando Sancho, 2011), for the first session of the incipient reading group in the lab, some ideas came to my mind. The article presents a tool to help researchers model and analyze historical processes and cultural dynamics.

The tool defines a model with messages, agents, and behaviours. Very briefly, a message is the most basic unit of information that can be stored or exchanged. There are three types of agents: individuals, mobile members of a social group that can exchange messages among themselves or acquire new ones; repositories, like individuals but fixed in space; and cultural items, a way to store an immutable message for transfer, also immobile. Finally, there are four ways in which agents can behave: reception, memory, learning and emission. Every kind of agent has a different set of behaviours. Cultural items do not receive information and always emit the same message; repositories work as a limited collection of messages, and when a repository is full, a message is selected for elimination. Individuals can be Creative, Dominant or Passive, according to the levels of attentionality and engagement they show with the messages. These three simple models make the VCL a really versatile cultural simulator. However, as the authors say in the article, VCL is a beta version and could be improved a bit.
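Just to make the two immobile agents concrete, here is a minimal sketch in Python. It is not the VCL code: the class names, the `capacity` parameter and, above all, the random choice of which message to eliminate are assumptions made for illustration.

```python
import random


class CulturalItem:
    """Immobile agent that never receives and always emits the same message."""

    def __init__(self, message):
        self._message = message

    def emit(self):
        return self._message


class Repository:
    """Immobile agent holding a limited collection of messages."""

    def __init__(self, capacity, rng=None):
        self.capacity = capacity
        self.messages = []
        self._rng = rng or random.Random()

    def receive(self, message):
        if message in self.messages:
            return
        if len(self.messages) >= self.capacity:
            # Assumption: the message to eliminate is picked uniformly at random.
            victim = self._rng.randrange(len(self.messages))
            del self.messages[victim]
        self.messages.append(message)

    def emit(self):
        # Emits any stored message, or None while the repository is empty.
        return self._rng.choice(self.messages) if self.messages else None
```

A repository built with `Repository(2)` that receives three different messages ends up holding two of them, and the newest one is always among those kept.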

I am lucky enough to be able to talk to the authors, and we are having a really interesting discussion about new ways to expand the VCL. On my side, I have been quite influenced by the book La evolución de la cultura (Luigi Luca Cavalli-Sforza, 2007) and the previously mentioned Maps of Time (David Christian, 2011), in such a way that demography and concept networks have become very significant factors from my point of view.

The idea is to use graphs to represent and store the culture of the individuals, and also graphs to represent the different cultures, trying to shift everything a bit towards the domain of Graph Theory. We will be able to store the whole universe of concepts, defined through the semantic relationships among them. In this scenario, we can work out a degree pruning to get the different connected components that represent the cultures, while always keeping the source graph. This prune function could be a measure over the relationships, for example ‘remove relationships between nodes with this value of betweenness centrality’, or even a random way to get connected components. But it would be better if the removed relationships made sense in semantic terms.
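The pruning step could be sketched like this, using only the standard library. The concepts, the relationships and the `strength` scores are made up for the example; in a real run the score would come from whatever measure is chosen (a centrality value, a semantic weight, or a random draw), and the threshold is equally an assumption.

```python
from collections import deque


def prune(edges, score, threshold):
    """Return the relationships whose score is at or above the threshold.

    The source list of edges is left untouched, so the original graph
    is always kept.
    """
    return [edge for edge in edges if score[edge] >= threshold]


def connected_components(nodes, edges):
    """Group nodes into connected components using breadth-first search."""
    neighbours = {node: set() for node in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for other in neighbours[node] - seen:
                seen.add(other)
                queue.append(other)
        components.append(component)
    return components


# Hypothetical universe of concepts and semantic relationships.
concepts = ["poem", "novel", "metaphor", "atom", "energy"]
relationships = [
    ("poem", "novel"), ("poem", "metaphor"),
    ("metaphor", "energy"), ("atom", "energy"),
]
# Hypothetical scores; a real run would derive them from a graph measure.
strength = {
    ("poem", "novel"): 0.9, ("poem", "metaphor"): 0.8,
    ("metaphor", "energy"): 0.1, ("atom", "energy"): 0.7,
}

cultures = connected_components(concepts, prune(relationships, strength, 0.5))
```

Here the weak link between "metaphor" and "energy" is the only one removed, so the universe splits into two cultures while `relationships` still holds the full source graph.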

After we have different graph cultures, we put them all in different places. Then we can take culture sub-graphs and store them in the individuals in order to give them a feeling of membership in a certain culture. Sub-graphs from the same culture could overlap each other, but sub-graphs from different cultures should be disjoint. Now, individuals start to move across the world. I would also introduce the notion of innovations for culture sub-graphs: an innovation is a deciduous concept with no relationships to any concept of the sub-graph, but with at least one relationship if we consider the set of relationships of the original graph. Somehow, this implies that everything is already in the world, but it is an interesting assumption to experiment with. Maybe the original graph could be dynamic and gain new concepts over time.
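One possible reading of that definition, and it is only a reading, is that the innovations available to a sub-graph are the concepts outside it that the original graph links to at least one of its concepts. Under that assumption, and with hypothetical names and data, it could look like this:

```python
def innovations(subgraph_concepts, original_relationships):
    """Concepts outside the sub-graph that the original graph relates to it."""
    candidates = set()
    for a, b in original_relationships:
        if a in subgraph_concepts and b not in subgraph_concepts:
            candidates.add(b)
        elif b in subgraph_concepts and a not in subgraph_concepts:
            candidates.add(a)
    return candidates


# Hypothetical original graph and the sub-graph stored in one individual.
original_relationships = [
    ("poem", "novel"), ("poem", "metaphor"),
    ("metaphor", "energy"), ("atom", "energy"),
]
individual_concepts = {"poem", "metaphor"}

new_ideas = innovations(individual_concepts, original_relationships)
```

With this data the individual could pick up "novel" or "energy", but not "atom", which the original graph does not relate to anything the individual already knows.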

So, individuals could show specific behaviours with regard to innovations: Conservative, Conformist and Liberal. And there would be another property to capture the feeling of belonging to a group distinct from the one the individual was born into. This value is kind of similar to the permeability to ideas, but different: while permeability works during the whole life of the individual, the membership feeling could operate until it is satisfied, so we can use it as a way to stop individuals, or to define the equilibrium.

Well, these are just ideas. Another approach could be to use population pyramids as inputs for the simulation. Yes, it’s me and demography again. If we do this, given a culture and a number of individuals that changes over time according to the population pyramid, we could see, and this is the point, how concepts move through cultures and, even more important, what the culture of the individuals is when the simulation stops. Calculating this is as easy as checking which sub-graphs are a subset of the existing cultures. This idea of using a population pyramid seems interesting to me because it allows us to analyze the importance of the individuals’ loss of permeability to innovations. Therefore, we could find out what the elements of vertical cultural transmission are (traditional, familial, and ritual), as opposed to horizontal transmission (which does not imply kinship but relations between individuals).
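The subset check at the end of a simulation might be as small as the sketch below. It compares only sets of concepts, which is a simplification (relationships could be compared the same way), and the culture names and data are invented for the example.

```python
def cultures_of(individual_concepts, cultures):
    """Names of the cultures that fully contain the concepts of an individual."""
    return [
        name
        for name, concepts in cultures.items()
        if individual_concepts <= concepts
    ]


# Hypothetical cultures, each one a connected component of the concept graph.
cultures = {
    "humanities": {"poem", "novel", "metaphor"},
    "physics": {"atom", "energy"},
}

settled = cultures_of({"poem", "metaphor"}, cultures)
mixed = cultures_of({"poem", "atom"}, cultures)
```

An individual whose concepts all come from one culture gets that culture back, while an individual who mixed concepts from two cultures matches none, which is exactly the interesting case to look at.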

And one more idea! This one is the craziest, I think. We could use a biology-inspired model for the concepts, so a concept would be defined by a vector that quantifies it using previously established knowledge fields. For instance, let’s say that an idea, i, is formed by 20% Literature, 20% Physics, and 0% Biology, so the resulting vector will be i = [20, 20, 0]. Also, ideas are related to each other through a graph. Following this biological analogy, we could set the vector to have 23 pairs of values, in such a way that individuals can adopt new ideas and modify them according to random changes in the last pair of values… or maybe this is too much craziness. Let’s see!
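As a toy version of that last idea, the sketch below stores an idea as 23 pairs of percentages and lets an individual adopt it with a random change in the last pair only. What each pair stands for, the size of the change (`max_change`) and the clamping to the 0 to 100 range are all assumptions; nothing here is settled.

```python
import random

PAIRS = 23  # by analogy with the 23 pairs mentioned above


def adopt(idea, rng, max_change=5):
    """Copy an idea and randomly perturb only its last pair of values."""
    copy = [tuple(pair) for pair in idea]
    first, second = copy[-1]
    copy[-1] = (
        min(100, max(0, first + rng.randint(-max_change, max_change))),
        min(100, max(0, second + rng.randint(-max_change, max_change))),
    )
    return copy


# Hypothetical idea: 20% in each field of the first pair, nothing elsewhere.
idea = [(20, 20)] + [(0, 0)] * (PAIRS - 1)
variant = adopt(idea, random.Random(42))
```

The original idea is never modified, so the same idea can be adopted by many individuals and drift differently in each of them.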


Filed under Analysis