Category Archives: Milestones

Bye Ripples, Hi Fronkonstin

Nobody knows how to say goodbye, It seems so easy until you try (Nobody Knows, The Lumineers)

I have been a WordPress.com user for (almost) three years and it has been a very good experience. But I want to take more control of my blog: install plugins, manage ads (anyone out there interested in advertising on my blog?) and place interactive charts in my posts instead of static screenshots. This is why I opened a new blog called Fronkonstin (I love the name, no other reason) where I will continue publishing. I will keep both blogs online for some time (although Ripples will not be updated anymore). If you follow Ripples, please start following Fronkonstin. I promise I will do my best to keep making interesting experiments in this new era.

Thanks.

 


First Anniversary Of Ripples

If people do not believe that mathematics is simple, it is only because they do not realize how complicated life is (John von Neumann)

I started this blog one year ago and the experience has been better than I could have imagined:

  • Ripples has been viewed about 30,000 times
  • 49 posts published (The Sound Of Mandelbrot Set was the most-viewed)
  • R-bloggers is my top referring site (thanks to Tal Galili for his support)
  • Visitors come from 141 countries: most of them came from the United States; Spain and the U.K. were not far behind (below you can see my map of the empire)

I have met a lot of wonderful people along the way, like Andrew Wyer, with whom I wrote a post about 3D-Harmonographs; the people of Merkbar, musicians who asked me for a plot for the cover of their new album (I will post about this project in the next few days); and Gregory Piatetsky from KDnuggets (I also expect to do something with him in the near future).

I think this new year is going to be fun. I still have a lot of experiments in mind that I want to explore with R.

Happy 2015!

 

WORLD

PageRank For SQL Lovers

If you’re changing the world, you’re working on important things. You’re excited to get up in the morning (Larry Page, CEO and Co-Founder of Google)

This is my personal tribute to one of the most important, influential and life-changing R packages I have discovered recently: the sqldf package.

Because of my job, transforming data through SQL queries comes very naturally to me. This, together with the power of R, has made this package indispensable to me ever since I learned of its existence.

Imagine you have a directed graph like this:

PR1

Given a vertex V, these are the steps to calculate its PageRank, let's call it PR(V):

  • Initialize PR(V) to some value (I set it to 1 in my script)
  • Iterate this formula until it converges: PR(V)=(1-d)+d*(PR(T1)/C(T1)+ ... +PR(Tn)/C(Tn)), where the Ti are the vertices that point to V, C(Ti) is the number of edges going out of Ti and d is the damping factor (0.85 in my script)
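To make the two steps above concrete, here is a minimal sketch in base R (no sqldf) that applies the formula to the same five-edge graph used later in the post. The tolerance `tol`, the cap of 1000 iterations and all variable names are assumptions chosen for illustration; they are not part of the original script, which simply runs 50 iterations.

```r
# Edge list of the example graph: each row is an edge origin -> end
net <- data.frame(origin = c("A", "A", "B", "C", "D"),
                  end    = c("C", "B", "C", "A", "C"),
                  stringsAsFactors = FALSE)

d   <- 0.85   # damping factor
tol <- 1e-8   # assumed convergence threshold (illustrative choice)

vertices <- sort(unique(c(net$origin, net$end)))
# C(T): number of edges going out of each vertex
outs <- table(factor(net$origin, levels = vertices))
# Step 1: initialise PR(V) to 1 for every vertex
pr <- setNames(rep(1, length(vertices)), vertices)

# Step 2: iterate the formula until the values stop changing
for (i in 1:1000) {
  # PR(T)/C(T) carried along each edge T -> V
  contrib <- pr[net$origin] / as.numeric(outs[net$origin])
  # Sum of the contributions arriving at each vertex (0 if nothing points to it)
  incoming <- tapply(contrib, factor(net$end, levels = vertices), sum)
  incoming[is.na(incoming)] <- 0
  new_pr <- setNames((1 - d) + d * as.numeric(incoming), vertices)
  converged <- max(abs(new_pr - pr)) < tol
  pr <- new_pr
  if (converged) break
}

round(pr, 2)
```

Every vertex is updated from the values of the previous iteration, which is one reasonable reading of "iterate this formula"; updating vertices in place would be another.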

After doing this, the result is:

PR2

Below you can find my code to do it with sqldf, which is quite simple from my point of view. I am pretty sure there must be some package which calculates PageRank, but the main goal of this post is to show how easy it is to calculate it with two simple queries, no more. The example is taken from here, where you can find a good explanation of how PageRank works:

library(sqldf)
library(igraph)
#Edge list of the directed graph: each row is an edge origin -> end
net=data.frame(origin=c("A","A","B","C","D"), end=c("C","B","C","A","C"))
par(family="serif", cex=1, ps=25, bg="white", col.lab="black", col.axis="black")
plot(graph_from_edgelist(as.matrix(net)), edge.arrow.size=1, vertex.color="gray90", edge.color="black")
#Initialization
#netou: number of outgoing edges of each vertex that has at least one
netou=sqldf("SELECT origin, COUNT(*) outs FROM net GROUP BY 1")
#netpr: every vertex of the graph (UNION removes duplicates) with an initial PageRank of 1
netpr=sqldf("SELECT origin vertex, 1.0 pagerank FROM net UNION SELECT end, 1.0 FROM net")
#50 iterations of the formula with d=0.85
for (i in 1:50)
{
  #netx1: PR(T)/C(T) for each vertex T with outgoing edges
  netx1=sqldf("SELECT vertex, pagerank/outs factor FROM netou a INNER JOIN netpr b ON (a.origin = b.vertex)")
  #New PageRank of each vertex: (1-d) plus d times the factors of the vertices pointing to it
  netpr=sqldf("SELECT a.vertex, 0.15+SUM(0.85*COALESCE(factor,0)) AS pagerank
  FROM netpr a LEFT OUTER JOIN net b ON (a.vertex = b.end) LEFT OUTER JOIN netx1 c
  ON (b.origin = c.vertex) GROUP BY 1")
}
#Label each vertex with its rounded PageRank and plot the graph again
g=graph_from_edgelist(as.matrix(net))
names=data.frame(vertex=V(g)$name)
V(g)$name=sqldf("SELECT a.vertex||' (PR='||ROUND(b.pagerank,2)||')' as name from names a inner join netpr b ON (a.vertex=b.vertex)")$name
plot(g, edge.arrow.size=1, vertex.color="gray90", edge.color="black")
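As a cross-check on the remark that some package must already calculate PageRank, the sketch below uses `page_rank()` from igraph on the same graph. This is only an illustrative comparison, not part of the original script. Note that igraph normalises the scores so they sum to 1, whereas the formula used in this post makes them sum to the number of vertices; multiplying by `vcount(g)` puts both on the same scale, under the assumption (true for this graph) that every vertex has at least one outgoing edge.

```r
library(igraph)

# Same edge list as in the post
net <- data.frame(origin = c("A", "A", "B", "C", "D"),
                  end    = c("C", "B", "C", "A", "C"),
                  stringsAsFactors = FALSE)
g <- graph_from_edgelist(as.matrix(net))

# page_rank() returns scores that sum to 1; rescale by the number of
# vertices to compare them with the (1-d)+d*(...) variant used above
pr_igraph <- page_rank(g, damping = 0.85)$vector
round(pr_igraph * vcount(g), 2)
```

If the rescaled values agree with the labels in the final plot, the two SQL queries are doing the same job as the packaged implementation for this example.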

Summer Summary

The universe is full of magical things patiently waiting for our wits to grow sharper (Eden Phillpotts)

I launched this blog 7 months ago and have published 30 posts during this time. These are some of my figures so far:

My favourite post? I don’t really know, but I am very proud of this one and this one. I have received more positive comments than negative ones and the future looks good: I have lots of experiments in my head to try with R.

Thanks a lot.

map