Monthly Archives: March 2014

Blurry Fractals

Beauty is the first test; there is no permanent place in the world for ugly mathematics (G. H. Hardy)

Newton basin fractals are the result of iterating Newton’s method to find roots of a polynomial over the complex plane. It may sound a bit complicated, but it is actually quite simple to understand. Those who would like to read more about Newton basin fractals can visit this page.
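
To make the iteration concrete, here is a minimal sketch of Newton’s method applied to f(z)=z^3-1 from a few starting points. It is only an illustration: the helper names `df`, `newton`, `starts` and `limits` are my own assumptions, and the derivative is written by hand rather than computed with numDeriv.

```r
# Newton's method for f(z) = z^3 - 1 with a hand-written derivative.
f  <- function(z) z^3 - 1
df <- function(z) 3 * z^2   # analytic derivative (assumption: used here instead of numDeriv::grad)

# Repeat the update z <- z - f(z) / f'(z) a fixed number of times.
newton <- function(z, iterations = 50) {
  for (k in seq_len(iterations)) z <- z - f(z) / df(z)
  z
}

starts <- c(2 + 0i, -1 + 1i, -1 - 1i)   # three arbitrary starting points
limits <- newton(starts)

# Each start ends (numerically) on a root of f, so |f| is essentially zero.
print(Mod(f(limits)))
```

The basin of a root is the set of starting points that end up on it; colouring a whole grid of starting points by where they land is what draws the fractal.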

These fractals are very easy to generate in R and produce very nice images. With only a small number of iterations, the resulting images look blurred when they are drawn with tile geometry in ggplot. Combined with the palettes provided by RColorBrewer, they give rise to very interesting images. Here you have some examples:

Result for f(z)=z^3-1 and palette equal to Set3:

Blurry1-Set3

Result for f(z)=z^4+z-1 and palette equal to Paired:

Blurry2-Paired

Result for f(z)=z^5+z^3+z-1 and palette equal to Dark2:

Blurry3-Dark2

Here you have the code. If you generate nice pictures, I will be very grateful if you send them to me:

library(ggplot2)
library(numDeriv)
library(RColorBrewer)
library(gridExtra)
## Polynomial: choose only one or try your own
f  <- function (z) {z^3-1}        #Blurry 1
#f  <- function (z) {z^4+z-1}     #Blurry 2
#f  <- function (z) {z^5+z^3+z-1} #Blurry 3
z <- outer(seq(-2, 2, by = 0.01),1i*seq(-2, 2, by = 0.01),'+')
for (k in 1:5) z <- z-f(z)/matrix(grad(f, z), nrow=nrow(z))
## Suppressing texts, titles, ticks, background and legend.
opt <- theme(legend.position="none",
             panel.background = element_blank(),
             axis.ticks=element_blank(), 
             axis.title=element_blank(), 
             axis.text =element_blank())
z <- data.frame(expand.grid(x=seq(ncol(z)), y=seq(nrow(z))), z=as.vector(exp(-Mod(f(z)))))
# Create plots. Choose a palette with display.brewer.all()
p1 <- ggplot(z, aes(x=x, y=y, color=z)) + geom_tile() + scale_colour_gradientn(colours=brewer.pal(8, "Paired")) + opt
p2 <- ggplot(z, aes(x=x, y=y, color=z)) + geom_tile() + scale_colour_gradientn(colours=brewer.pal(7, "Paired")) + opt
p3 <- ggplot(z, aes(x=x, y=y, color=z)) + geom_tile() + scale_colour_gradientn(colours=brewer.pal(6, "Paired")) + opt
p4 <- ggplot(z, aes(x=x, y=y, color=z)) + geom_tile() + scale_colour_gradientn(colours=brewer.pal(5, "Paired")) + opt
# Arrange four plots in a 2x2 grid
grid.arrange(p1, p2, p3, p4, ncol=2)

Random Love

Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin (John von Neumann)

Ms. Positive and Mr. Negative live in a one-dimensional world and are falling in love. But beginnings are not always easy. They have a big problem: neither of them likes the other’s neighborhood. Ms. Positive only wants to walk around the Positive Integers Neighborhood and Mr. Negative around the Negative Integers one. This is a prickly problem they need to deal with as soon as possible. But they have a good idea. They will start their walks from Zero, an impartial place between both neighborhoods, and will let fate guide their feet. They will toss a coin to decide every step: if the result is heads, they will advance 1 step toward the positive numbers neighborhood; if tails, they will advance 1 step toward the negative numbers one. For example, if the first 5 tosses are heads, heads, tails, tails and tails, then their first 5 steps will take them to +1, +2, +1, 0 and -1. It seems to be a fair agreement for both. Maybe it is not the most pleasant way to take a walk, but it is well known that lovers tend to do silly things constantly, especially at the beginning. They always walk for two hours, so they toss the coin 7,200 times every walk (these lovers are absolutely crazy, as you can see). This was their first walk:

plot1

After this first walk, Mr. Negative was really upset. Ms. Positive, watching his face fall, asked him: What’s the matter, honey? and Mr. Negative replied: What’s the matter? What’s the matter? The matter is that we spent almost all the time walking around your horrible neighborhood! What came next is too hard to be reproduced here. Anyway, they agreed to give the method they had designed a chance. How could one imagine that a coin can produce such a strange walk! There must be an error! After 90 walks, the situation of our lovers was extremely delicate. 57% of the walks were absolutely awful for one of them, since more than 80% of the steps were around the same neighborhood. Another 32% were a bit uncomfortable for one of them, since between 60% and 80% of the steps were around the same neighborhood. Only 11% of the walks were gratifying. How is it possible?, said Mr. Negative. How is it possible?, said Ms. Positive.

hist2

But here comes Ms. Positive, who always looks on the bright side of life: Don’t worry, darling. In fact, we don’t have to be sad. We get angry the same number of times! For me that is enough. What about you?, she said. For me it is perfect as well!, said Mr. Negative. At that moment, they realised they were made for each other and started another random walk with big smiles on their faces.

water2

This is the code:

library(ggplot2)
steps   <- 2*60*60 #Number of steps
results <- data.frame()
walks<-90 #Number of walks
for (i in 1:walks)
{
  state <- cumsum(sample(c(-1,1), steps, replace = TRUE))
  results <- rbind(results, c(sum(state<0), sum(state>0), sum(state==0), 
                              if (sum(state<0) >= sum(state>0)) 1 else 0))
}
colnames(results) <- c("neg.steps", "pos.steps", "zero.steps", "ind.neg")
results$max.steps <- apply(results, 1, max)/apply(results, 1, sum)
#Plot of one of these walks
mfar=max(abs(max(state)),abs(min(state)))
plot1 <- qplot(seq_along(state), 
      state, 
      geom="path")+ 
  xlab("Step") + 
  ylab("Location") +
  labs(title = "The First Walk Of Ms. Positive And Mr. Negative")+
  theme(plot.title = element_text(size = 35))+
  theme(axis.title.y = element_text(size = 20))+
  theme(axis.title.x = element_text(size = 20))+
  scale_x_continuous(limits=c(0, length(state)),breaks=c(1,steps/4,steps/2,3*steps/4,steps))+
  scale_y_continuous(limits=c(-mfar, mfar), breaks=c(-mfar,-mfar/2, 0, mfar/2,mfar))+
  geom_hline(yintercept=0)
ggsave(plot1, file="plot1.png", width = 12, height = 10)
#Summary of all walks
hist1 <- ggplot(results, aes(x = max.steps))+
  geom_histogram(colour = "white",breaks=seq(.4,1,by=.2),fill=c("blue", "orange", "red"))+
  theme_bw()+
  labs(title = paste("What Happened After ", toString(walks), " Walks?",sep = ""))+
  scale_y_continuous(breaks=seq(0,(nrow(results[results$max.steps>.8,])+10),by=10))+
  theme(plot.title = element_text(size = 40))+  
  xlab("Maximum Steps In The Same Location (%)") + 
  ylab("Number of Walks")
ggsave(hist1, file="hist1.png", width = 10, height = 8)
#Data for waterfall chart
waterfall <- as.data.frame(cbind(
  c("Total Walks", "Satisfactory Walks", "Uncomfortable Walks", "Awful Walks for Ms. +", "Awful Walks for Mr. -"),
  c("a", "b", "c", "d", "d"),
  c(0, 
    nrow(results),
    nrow(results)-nrow(results[results$max.steps<.6,]),
    nrow(results)-nrow(results[results$max.steps<.6,])-nrow(results[results$max.steps>=.6 & results$max.steps<.8,]),
    nrow(results)-nrow(results[results$max.steps<.6,])-nrow(results[results$max.steps>=.6 & results$max.steps<.8,])-nrow(results[results$max.steps>=.8 & results$ind.neg==1,])
    ),
  c(nrow(results),
    nrow(results)-nrow(results[results$max.steps<.6,]),
    nrow(results)-nrow(results[results$max.steps<.6,])-nrow(results[results$max.steps>=.6 & results$max.steps<.8,]),
    nrow(results)-nrow(results[results$max.steps<.6,])-nrow(results[results$max.steps>=.6 & results$max.steps<.8,])-nrow(results[results$max.steps>=.8 & results$ind.neg==1,]),
    0
    ),
  c(nrow(results), 
    nrow(results[results$max.steps<.6,]), 
    nrow(results[results$max.steps>=.6 & results$max.steps<.8,]), 
    nrow(results[results$max.steps>=.8 & results$ind.neg==1,]), 
    nrow(results[results$max.steps>=.8 & results$ind.neg==0,]))
))
colnames(waterfall) <-c("desc", "type", "start", "end", "amount")
waterfall$id <- seq_along(waterfall$amount)
waterfall$desc <- factor(waterfall$desc, levels = waterfall$desc)
#Waterfall chart
water1 <- ggplot(waterfall, aes(desc, fill = type)) + 
  geom_rect(aes(x = desc, xmin = id-0.45, xmax = id+0.45, ymin = end, ymax = start))+ 
  xlab("Kind of Walk") + 
  ylab("Number of Walks") +
  labs(title = "The Ultimate Proof (After 90 Walks)")+
  theme(plot.title = element_text(size = 35))+
  theme(axis.title.y = element_text(size = 20))+
  theme(axis.title.x = element_text(size = 20))+
  theme(legend.position = "none")
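
The percentages in the story are not a fluke of these 90 walks. As a rough check, and assuming the classical arcsine law for long fair random walks (an approximation that ignores the few steps spent exactly at zero), the expected share of each kind of walk can be computed directly. The function name `arcsine_cdf` and the variables below are my own, not part of the code above.

```r
# Arcsine law: for a long fair random walk, the fraction of time spent on the
# positive side has (approximately) this cumulative distribution function.
arcsine_cdf <- function(x) (2 / pi) * asin(sqrt(x))

# Awful walks: one side gets more than 80% of the steps
# (fraction on the positive side below 0.2 or above 0.8; both tails are equal).
p_awful <- 2 * arcsine_cdf(0.2)

# Uncomfortable walks: one side gets between 60% and 80% of the steps.
p_uncomfortable <- 2 * (arcsine_cdf(0.4) - arcsine_cdf(0.2))

# Satisfactory walks: neither side gets more than 60% of the steps.
p_satisfactory <- 1 - p_awful - p_uncomfortable

print(round(100 * c(awful = p_awful,
                    uncomfortable = p_uncomfortable,
                    satisfactory = p_satisfactory)))
```

Under that assumption the three shares come out at roughly 59%, 28% and 13%, close to the 57%, 32% and 11% observed by our lovers.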

The Lonely Acacia Is Rocked By The Wind Of The African Night

If you can walk you can dance. If you can talk you can sing (Zimbabwe Proverb)

There are two things in this picture I would like to emphasise. The first one is that everything is made using points and lines. The moon is an enormous point, each star is three small nested points and the tree is a set of straight lines. Points and lines over a simple Cartesian graph, no more. The second one is that the tree is a jittered fractal. In particular, it is a jittered L-system fractal, a formalism invented in 1968 by a biologist (Aristid Lindenmayer) that yields a mathematical description of plant growth. Why jittered? Because I add some positive noise to the angle at which each branch is iteratively divided in two. It gives the tree the appearance of being rocked by the wind. This is the picture:

image1

I generated 120 images and gathered them in this video to make the wind happen. The stunning song is called Kothbiro, performed by Ayub Ogada.

Here you have the code:

depth <- 9
angle<-30 #Between branches division
L <- 0.90 #Decreasing rate of branches by depth
nstars <- 300 #Number of stars to draw
mstars <- matrix(runif(2*nstars), ncol=2)
branches <- rbind(c(1,0,0,abs(jitter(0)),1,jitter(5, amount = 5)), data.frame())
colnames(branches) <- c("depth", "x1", "y1", "x2", "y2", "inertia")
for(i in 1:depth)
{
  df <- branches[branches$depth==i,]
  for(j in 1:nrow(df))
  {
    branches <- rbind(branches, c(df[j,1]+1, df[j,4], df[j,5], df[j,4]+L^(2*i+1)*sin(pi*(df[j,6]+angle)/180), df[j,5]+L^(2*i+1)*cos(pi*(df[j,6]+angle)/180), df[j,6]+angle+jitter(10, amount = 8)))
    branches <- rbind(branches, c(df[j,1]+1, df[j,4], df[j,5], df[j,4]+L^(2*i+1)*sin(pi*(df[j,6]-angle)/180), df[j,5]+L^(2*i+1)*cos(pi*(df[j,6]-angle)/180), df[j,6]-angle+jitter(10, amount = 8)))
  }
}
nodes <- rbind(as.matrix(branches[,2:3]), as.matrix(branches[,4:5]))
png("image.png", width = 1200, height = 600)
plot.new()
par(mai = rep(0, 4), bg = "gray12")
plot(nodes, type="n", xlim=c(-7, 3), ylim=c(0, 5))
for (i in 1:nrow(mstars)) 
{
  points(x=10*mstars[i,1]-7, y=5*mstars[i,2], col = "blue4", cex=.7, pch=16)
  points(x=10*mstars[i,1]-7, y=5*mstars[i,2], col = "blue",  cex=.3, pch=16)
  points(x=10*mstars[i,1]-7, y=5*mstars[i,2], col = "white", cex=.1, pch=16)
}
# The moon
points(x=-5, y=3.5, cex=40, pch=16, col="lightyellow")
# The tree
for (i in 1:nrow(branches)) {lines(x=branches[i,c(2,4)], y=branches[i,c(3,5)], col = paste("gray", as.character(sample(seq(from=50, to=round(50+5*branches[i,1]), by=1), 1)), sep = ""), lwd=(65/(1+3*branches[i,1])))}
rm(branches)
dev.off()
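
The heart of the loop above is the step that turns one branch into two. The following sketch isolates that step. It is a simplified illustration, not the code used for the picture: the function `split_branch` and its arguments are hypothetical names of mine, and the noise defaults to zero so that the result is reproducible.

```r
# Split one branch into two children.
# (x, y)  : end point of the parent branch, where the children start
# heading : direction of the parent, in degrees measured from the vertical
# len     : length of the children
# angle   : half of the opening between the two children, in degrees
# noise   : extra degrees passed on to the next generation (the "wind")
split_branch <- function(x, y, heading, len, angle = 30, noise = 0) {
  h <- heading + c(angle, -angle)   # headings of the two children
  data.frame(
    x1 = x, y1 = y,
    x2 = x + len * sin(pi * h / 180),
    y2 = y + len * cos(pi * h / 180),
    heading = h + noise             # noise only affects the next generation
  )
}

# A vertical trunk ending at (0, 1), split into two branches of length 0.9^3.
children <- split_branch(x = 0, y = 1, heading = 0, len = 0.9^3)
print(children)
```

With no noise the two children are mirror images of each other; a positive `noise` tilts later generations to one side, which is what bends the tree.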

Dora’s Choice

Arithmetic is being able to count up to twenty without taking off your shoes (Mickey Mouse)

Dora-1

On her last mission, Dora The Explorer sails down the Amazon river to save her friend Isa The Iguana from the claws of Swiper The Fox. After some hours of navigation, Dora sees how the river divides into 3 branches and has to choose which one to follow. Before leaving, her friend Map told her that just one of these branches is safe. The other two end in terrible waterfalls, both impossible to escape alive. Although Dora does not know which one is the good one, she decides to take branch number 1. Suddenly, her friend Boots The Monkey yells from the top of a palm tree:

Dora, do not take branch number 3! I can see from here that it ends in a horrible waterfall!

After listening to Boots, Dora changes her mind and decides to take branch number 2. Why does Dora switch? Because she knows that this change significantly increases her probability of ending the mission alive.

There are several ways to convince yourself of this. One is to simulate the situation Dora has faced and compare the results of switching and not switching. By switching, Dora saves her life in 2 out of every 3 simulations, while if she does not, she only survives in 1 out of every 3. By changing her mind, Dora doubles her chances of survival!

Considering carefully what happens, you can see that by switching Dora saves herself whenever her first choice is wrong, which occurs with probability 2/3. On the other hand, if Dora remains faithful to her first choice, she obviously only saves herself with probability 1/3.
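
The same reasoning can be checked without any randomness by listing the 9 equally likely combinations of safe branch and first choice. This is a small sketch of my own (the names `cases` and `switch_to` are assumptions). It assumes the advice always points to a dangerous branch other than the one Dora has chosen.

```r
branches <- 1:3

# All equally likely combinations of the safe branch and Dora's first choice.
cases <- expand.grid(safe = branches, first = branches)

# Branch Dora ends up in if she switches after being warned about a
# dangerous branch different from her first choice.
switch_to <- function(safe, first) {
  revealed <- setdiff(branches, c(safe, first))[1]   # a dangerous branch she did not pick
  setdiff(branches, c(first, revealed))              # the only branch left
}
cases$second <- mapply(switch_to, cases$safe, cases$first)

# Probability of survival with each strategy.
p_stay   <- mean(cases$first  == cases$safe)
p_switch <- mean(cases$second == cases$safe)
print(c(stay = p_stay, switch = p_switch))
```

Staying survives in 3 of the 9 cases and switching in the other 6, which are the 1/3 and 2/3 above.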

This is my own version of the famous Monty Hall Problem. You can see a nice explanation of it in an episode of Numb3rs or in the film 21 Black Jack. Not long ago I presented the problem at a family gathering. Only my mum said she would switch (there were 6 of us). It is fun to share this experiment and ask what people would do. Try it with your friends and family. The first time I came across the problem, I thought there was no difference between switching and not switching, since I gave both possibilities a probability of 1/2. If I had been Dora, I am pretty sure I would have tumbled over a terrible waterfall. What about you?

Note: this is an update of the post, which was not a correct formulation of the Monty Hall Problem. Thanks to David Robinson and Scott Kostyshak for showing me my error. A correct formulation of the problem may be this:

On her last mission, Dora The Explorer sails down the Amazon river to meet her cousin Diego. After some hours of navigation, Dora sees how the river divides into 3 branches and has to choose which one to follow. Before leaving, her friend Map told her that just one of these branches is safe. The other two end in terrible waterfalls, both impossible to escape alive. Although Dora does not know which one is the good one, she decides to take branch number 1. After pointing the bow towards branch number 1, Dora sees Swiper The Fox smiling from the shore, in a high place from where he can obviously see the end of all three branches. Dora yells to him:

– Help me Swiper! Which one should I take?

Swiper replies:

– I am the villain of this story, so I will give you only one piece of advice: do not take branch number 3. It ends in a terrible waterfall.

Dora, who has a sixth sense for noticing when Swiper is lying, knows he is telling the truth, so she immediately changes her mind and decides to take branch number 2. Why does Dora switch? Because she knows that this change significantly increases her probability of ending the mission alive.

doras_choice

Here you have the code:

library(ggplot2)
library(extrafont)
nchoices <- 3
nsims <- 500
choices <- seq(from=1, to=nchoices, by=1)
good.choice <- sample(choices, nsims, replace=TRUE)
choice1 <- sample(choices, nsims, replace=TRUE)
dfsims <- as.data.frame(cbind(good.choice, choice1))
dfsims$advice <- apply(dfsims, 1, function(x) choices[!choices %in% as.vector(x)][sample(1:length(choices[!choices %in% as.vector(x)]), 1)])
dfsims$choice2 <- apply(dfsims, 1, function(x) choices[!choices %in% as.vector(c(x[2], x[3]))][sample(1:length(choices[!choices %in% as.vector(c(x[2], x[3]))]), 1)])
dfsims$win1 <- apply(dfsims, 1, function(x) (x[1]==x[2])*1)
dfsims$win2 <- apply(dfsims, 1, function(x) (x[1]==x[4])*1)
dfsims$csumwin1 <- cumsum(dfsims$win1)/as.numeric(rownames(dfsims))
dfsims$csumwin2 <- cumsum(dfsims$win2)/as.numeric(rownames(dfsims))
dfsims$nsims <- as.numeric(rownames(dfsims))
dfsims$xaxis <- 0
### XKCD theme
theme_xkcd <- theme(
panel.background = element_rect(fill="darkolivegreen1"),
panel.border = element_rect(colour="black", fill=NA),
axis.line = element_line(size = 0.5, colour = "black"),
axis.ticks = element_line(colour="black"),
panel.grid = element_line(colour="white", linetype = 2),
axis.text.y = element_text(colour="black"),
axis.text.x = element_text(colour="black"),
text = element_text(size=18, family="Humor Sans"),
plot.title = element_text(size = 50)
)
### Plot the chart
p <- ggplot(data=dfsims, aes(x=nsims, y=csumwin1))+
geom_line(aes(y=csumwin2), colour="green4", size=1.5)+
geom_line(colour="green4", size=1.5)+
geom_text(data=dfsims[400, ], family="Humor Sans", aes(x=nsims), colour="green4", y=0.7, label="if Dora switches ...", size=5.5, hjust=1)+
geom_text(data=dfsims[400, ], family="Humor Sans", aes(x=nsims), colour="green4", y=0.3, label="if Dora does not switch ...", size=5.5, hjust=1)+
coord_cartesian(ylim=c(0, 1), xlim=c(1, nsims))+
scale_y_continuous(breaks = c(0,round(1/3, digits = 2),round(2/3, digits = 2),1), minor_breaks = c(round(1/3, digits = 2),round(2/3, digits = 2)))+
scale_x_continuous(minor_breaks = seq(100, 400, 100))+
labs(x="Number Of Simulations", y="Rate Of Survival", title="Dora's Choice")+
theme_xkcd
ggsave("doras_choice.jpg", plot=p, width=8, height=5)

Warholing Grace With Clara

Do not believe anything: what artists really do is to hang around all day (Paco de Lucia)

Andy Warhol was a mathematician. At least, he knew how clustering algorithms work. I am pretty sure of this after doing this experiment. First of all, let me introduce you to the breathtaking Grace Kelly:

936full-grace-kelly

In my previous post I also worked with images, showing how simple it is to operate on them since they are represented by matrices. This is another example. An image is stored as a 3D array whose third dimension holds the colour of each pixel in (r, g, b) format. Applying a clustering algorithm to this information generates groups of pixels with similar colour. I used the cluster package and, because of the large size of the picture, I decided to use the clara algorithm, which is extremely fast. Apart from its speed, another advantage of clara is that clusters are represented by real elements of the population, called medoids, instead of by average individuals as in k-means. This fits my purposes very well because, once the clusters are calculated, I only have to replace each pixel with its medoid and plot the result. Setting clara to divide the pixels into 2 groups generates a 2-coloured image, setting it to 3 groups generates a 3-coloured one, and so on. Below you can find the results from 2 to 7 groups:

Warholing2 Warholing3 Warholing4
Warholing5 Warholing6 Warholing7
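
The pixel-by-medoid replacement can be tried without any image at all. The sketch below is only an illustration under my own assumptions: it uses random (r, g, b) values instead of a picture, and the names `pixels`, `fit` and `quantised` are hypothetical. Only `clara` and its `medoids`, `clustering` and `i.med` components come from the cluster package.

```r
library(cluster)

set.seed(1)
# Stand-in for an image: 1000 pixels with random (r, g, b) values in [0, 1].
pixels <- matrix(runif(3000), ncol = 3, dimnames = list(NULL, c("r", "g", "b")))

colors <- 4
fit <- clara(pixels, colors)

# Replace every pixel by the medoid of the cluster it belongs to.
quantised <- fit$medoids[fit$clustering, ]

# The recoloured "image" uses only `colors` distinct colours ...
print(nrow(unique(quantised)))
# ... and each of them is a real pixel of the original data.
print(pixels[fit$i.med, ])
```

Reshaping `quantised` back into the rows and columns of the original picture is all that is left to get the warholed image.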

Working with samples can be a handicap, though perhaps a less important one than the speed it brings. Sometimes the image generated with n groups seems to be a worse fit than the one generated with n-1 groups. You can see it in this video, where the results from 1 to 60 groups are presented sequentially. It only takes 42 seconds.

Here you have the code. Feel free to do some warholing of your own:

library("biOps")
library("abind")
library("reshape")
library("reshape2")
library("cluster")
library("sp")
#######################################################################################################
#Initialization
#######################################################################################################
x     <- readJpeg("936full-grace-kelly.jpg")
plot(x)
#######################################################################################################
#Data
#######################################################################################################
data <- merge(merge(melt(x[,,1]), melt(x[,,2]), by=c("X1", "X2")), melt(x[,,3]), by=c("X1", "X2"))
colnames(data) <- c("X1", "X2", "r", "g", "b")
#######################################################################################################
#Clustering
#######################################################################################################
colors <- 5
clarax <- clara(data[,3:5], colors)
datacl   <- data.frame(data, clarax$cluster)
claradf2 <- as.data.frame(clarax$medoids)
claradf2$id <- as.numeric(rownames(claradf2))
colnames(claradf2) <- c("r", "g", "b", "clarax.cluster")
claradf <- merge(datacl, claradf2, by=c("clarax.cluster"))
colnames(claradf) <-c("clarax.cluster", "X1", "X2", "r.x", "g.x", "b.x", "r.y", "g.y", "b.y")
datac <- claradf[do.call("order", claradf[c("X1", "X2")]), ]
x1<-acast(datac[,c(2,3,7)], X1~X2, value.var="r.y") 
x2<-acast(datac[,c(2,3,8)], X1~X2, value.var="g.y") 
x3<-acast(datac[,c(2,3,9)], X1~X2, value.var="b.y") 
warhol <- do.call(abind, c(list(x1,x2,x3), along = 3))
plot(imagedata(warhol))
writeJpeg(paste("Warholing", as.character(colors), ".jpg", sep=""), imagedata(warhol))