Emilian Bold's blog

Broadcasting from the Romanian trenches

Tuesday, March 03, 2009

Matrix thoughts

AI lemma (anti-Matrix):
We do not live in a simulation of a similar universe as a simulation will consume more energy than a real existence.

A corollary would be:
The universe simulating our existence might be so different that the above lemma doesn't apply.

Monday, February 02, 2009

iPhone Location Manager taking forever

On the iPhone, the Location manager that provides the GPS location is a nice API to use.

It does have some issues though: CLLocationManager doesn't work if it's created and used from another thread!

I first noticed something was really funny when my delegate wasn't being called at all.

Neither – locationManager:didUpdateToLocation:fromLocation: nor – locationManager:didFailWithError: was called and my application was just waiting there forever for some GPS information.

My first thought was that it was some issue with my memory management as I wasn't holding a reference to the location manager in any class, just in the method where it was created. But still, it didn't work.

Then I thought it was a problem with the threading model being used (I waited for the GPS location in another thread in order not to block the GUI). Sure enough, that seemed to be the problem, and at least one other person has complained about it. I'm not sure whether it is a matter of threading or of the memory pool being used.

But more to the point: always create your CLLocationManager instance in the main thread, and not in another thread. Having a singleton method which is first called from the main thread ensures that the location manager is created in the proper thread/pool.

Friday, January 30, 2009

Developer surprise on OSX

I had a strange bug in the OSX Address Book application: I had a rule that included all the address cards not present in any other rule.

This worked initially, but after an update, Address Book got confused and entered an infinite loop (it was probably trying to ignore the cards in the rule itself and then went on to resolve that recursively).

Anyhow, the good thing was that the application crashed only if I scrolled over that particular rule. And since I had quite a lot of rules, I could at least open the application safely.

But still, a semi-buggy application isn't fun to use. So I went and looked at the Address Book file format, which seemed to be some sqlite3 database, but I couldn't fix the problem from there.

To my surprise, Apple has a public API for the Address Book!

So I wrote these short lines of code:
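    #import <AddressBook/AddressBook.h>

    ABAddressBook *AB = [ABAddressBook sharedAddressBook];
    NSArray *groups = [AB groups];

    for (NSUInteger i = 0; i < [groups count]; i++) {
        ABGroup *group = [groups objectAtIndex:i];
        NSString *name = [group valueForProperty:kABGroupNameProperty];

        // Remove only the group whose name matches the broken rule.
        // isEqualToString: is safe even if a group has no name (nil).
        if ([name isEqualToString:@"BadBadRule"]) {
            [AB removeRecord:group];
            [AB save];
        }
    }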

and that was it! No more Address Book crashes! Turns out OSX is really nice to tweak if you are willing to code a bit.

Wednesday, January 14, 2009

My Slicehost / VPS analysis

First-time VPS user



For a few months now, I've had a VPS from Slicehost. It's the cheapest one they've got, with only 256MB RAM.

I had never worked on a VPS before; I only had either dedicated physical servers in the company datacenter (at the previous job) or CPanel-based hosted accounts (for some other clients).

All in all, a VPS is just as one might expect: almost like a normal server, only slower.

And the slowness is starting to bug me a bit, specifically the problem that I don't know how slow it is supposed to be.

The fixed technical details from Slicehost are that you'll have 256MB RAM, 10GB of disk storage and 100GB bandwidth.

Now there are 2 issues here. One which seems quite obvious and another one I'll introduce later.

CPU



OK, the 1st problem is that you don't know how many CPU cycles you are going to get. Being a VPS means it runs on some beefy server (Slicehost says it's a quad-core server with 16GB of RAM).

According to Slicehost's FAQ:

Each Slice is assigned a fixed weight based on the memory size (256, 512 and 1024 megabytes). So a 1024 has 4x the cycles as a 256 under load. However, if there are free cycles on a machine, all Slices can consume CPU time.


This basically means that under load, each slice gets CPU cycles depending on the RAM it has (ie. the price you pay). A 256MB slice gets 1 cycle, the 512MB slice gets 2 cycles, the 1GB slice gets 4 cycles and so on.
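The weighting rule can be sketched in a few lines of Python. This is only an illustration of the proportional-share idea as the FAQ describes it: the function name and the example mix of slices are made up, and I don't know how Slicehost actually configures its scheduler.

```python
def cpu_shares(slice_sizes_mb):
    """Return each slice's fraction of contended CPU time, proportional to its RAM."""
    total = sum(slice_sizes_mb)
    return [size / total for size in slice_sizes_mb]

# Hypothetical host with one slice of each size, all busy at the same time.
sizes = [256, 512, 1024]
shares = cpu_shares(sizes)

for size, share in zip(sizes, shares):
    print(f"{size}MB slice: {share:.1%} of contended CPU time")
```

Whatever the mix, the 1024MB slice ends up with 4 times the share of the 256MB one, which is all the FAQ really promises.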

The problem here is, of course, that you can't be certain how many slices they actually put on one server. Slicehost is clearly overselling, as top usually displays a "Steal time" of around 20%.

So, assuming a machine is filled 100% with slices and there is no multiplexing, a 16GB machine holds 64 slices of 256MB sharing 4 cores, which means that a 256MB slice gets 1/16, ie. 6.25%, of a single CPU under load.

6.25% isn't much at all, but considering that the machine isn't always under load, the slice seems to get a decent amount of CPU nonetheless.

If we consider the overselling issue and that 20% is stolen by Xen to give to other VPSes, we get to an even 5% (6.25% x 0.8).
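The arithmetic behind those two percentages, as a small sketch. It assumes the host is filled only with 256MB slices and that the 20% steal time simply scales down the share, which is my simplification and not something Slicehost documents.

```python
HOST_RAM_MB = 16 * 1024   # 16GB host, per Slicehost
CORES = 4                 # quad-core host, per Slicehost
SLICE_RAM_MB = 256
STEAL_FRACTION = 0.20     # steal time observed in top (assumed to scale the share)

slices_per_host = HOST_RAM_MB // SLICE_RAM_MB            # host full of 256MB slices
share_of_one_core = CORES / slices_per_host              # fraction of a single core
after_steal = share_of_one_core * (1 - STEAL_FRACTION)   # what is left after steal

print(f"slices per host: {slices_per_host}")
print(f"share of one core under load: {share_of_one_core:.2%}")
print(f"share after steal time: {after_steal:.2%}")
```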

Now, this might not be as bad as it sounds CPU-wise as I've noticed Xen stealing time when my CPU-share is basically idle anyhow so maybe it doesn't affect my overall performance.

For example: ./pi_css5 1048576 takes about 10 seconds which is more than decent.

IO



The bigger problem with a VPS seems to be the fact that hard drives aren't nearly as fast as RAM. And when you have a lot of processes competing for the same disk, it's bound to be slow.

What Slicehost doesn't mention is whether the "fixed weight" sharing rule they use for CPU cycles applies to disk access too. My impression is that it does.

After trying to use my VPS as a build server, I noticed it grinding to a halt.

top shows something like this:


Cpu(s): 0.0%us, 0.0%sy, 0.0%ni, 62.2%id, 20.9%wa, 0.0%hi, 0.0%si, 16.9%st


but the load average for a small build is something like


load average: 1.73, 2.06, 1.93
and it easily goes to 3, 4 and even 9 when I also try to do something else there.

So, looking at the information above, we can note that the CPU is just idle 62.2% of the time, while the actual "working" tasks, ie. 20.9%, are waiting for IO. The remaining 16.9% of CPU time is stolen by Xen and given to other virtual machines, and I don't think it really matters given that the load is clearly IO-bound.

And here lies the problem: just how fast might Slicehost's hard drives be? And how many per slice? Actually, more like: how many slices per drive?
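To make that reading of the top line explicit, here is a small sketch that splits the summary line into its fields. The helper name is made up; the field suffixes are the standard top ones (us/sy/ni = running code, id = idle, wa = waiting on IO, st = stolen by the hypervisor).

```python
import re

def parse_top_cpu(line):
    """Parse a top 'Cpu(s):' summary line into a {field: percent} dict."""
    return {name: float(value)
            for value, name in re.findall(r"([\d.]+)%(\w+)", line)}

line = "Cpu(s): 0.0%us, 0.0%sy, 0.0%ni, 62.2%id, 20.9%wa, 0.0%hi, 0.0%si, 16.9%st"
cpu = parse_top_cpu(line)

# Time actually spent running code (user + system + niced processes).
busy = cpu["us"] + cpu["sy"] + cpu["ni"]
print(f"running code: {busy}%  idle: {cpu['id']}%  "
      f"waiting on IO: {cpu['wa']}%  stolen: {cpu['st']}%")
```

With nothing in us/sy/ni, the only non-idle time is IO wait and steal, which is what an IO-bound machine looks like.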

From a simple test I made, a simple build that takes 30 seconds on my MacBook Pro (2.4GHz/2GB RAM/5400rpm laptop hard drive) takes about 20 minutes on the slice. This means the VPS is 40 times slower when doing IO-bound tasks.

Another large build that takes around 40 minutes on my laptop took 28 hours on the server, which is consistent with the roughly 40 times slower rule.

Now, considering the above numbers and a 20% steal time, I'd expect about 20% overselling of slices on a physical machine. That means, at 16GB per machine, roughly 76 slices of 256MB on one machine (64 x 1.2). Taking into account the 1:40 rule above for IO speed, and assuming one of their drives is about as fast as my laptop's, this means that they have about 2 hard drives in a server (76 / 40).
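The back-of-the-envelope numbers above can be reproduced like this. It is only a sketch of my guesswork: it assumes the slowdown comes purely from slices sharing a drive equally, and that one server drive is about as fast as my laptop's drive.

```python
# Build times measured on the laptop vs. on the slice, in seconds.
laptop_small_s, slice_small_s = 30, 20 * 60          # small build
laptop_large_s, slice_large_s = 40 * 60, 28 * 3600   # large build

small_slowdown = slice_small_s / laptop_small_s      # how many times slower
large_slowdown = slice_large_s / laptop_large_s

nominal_slices = (16 * 1024) // 256                  # 16GB host full of 256MB slices
oversold_slices = nominal_slices * 1.20              # assumed 20% overselling
drives = oversold_slices / small_slowdown            # assumed ~40 slices per drive

print(f"small build slowdown: {small_slowdown:.0f}x")
print(f"large build slowdown: {large_slowdown:.0f}x")
print(f"estimated slices per host: {oversold_slices:.1f}")
print(f"estimated drives per host: {drives:.2f}")
```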

Conclusions



It's certainly liberating to have complete control over a server. CPanel solutions just don't cut it when you need to run various applications on strange ports. Of course, the downside is that you also have to do all the administration tasks, secure it, etc.

The Slicehost services are very decent price-wise, and the "administrator" panel they have provides you with everything you need, even a virtual terminal that goes to tty1 of the machine (very handy if, for some reason, SSH doesn't work).

Even the smallest slice I'm using right now has enough space, RAM and bandwidth for small tasks. If you just use it sparingly during business hours, the "fixed weight" sharing rule gives you enough CPU / IO for most tasks.

But for heavy usage, I think the solution is either to get a more expensive slice or start building your own machine.

IO-bound tasks are almost impossible to run due to the 1:40 slowness I noticed. My guess is that you need to get at least the 4GB slice to have them run decently. Of course, that's $250 compared to the $20 slice I have right now.

CPU doesn't seem to be a problem, at least for my kind of usage. It seems responsive enough during normal load and mostly idle under heavy load (so idle that Xen gives my CPU cycles to other virtual machines). Initially I was expecting this to be a major problem while moving my build server there, but boy, was I wrong. IO-limitations don't even compare with the CPU limitations.

Getting 5% or more of a fast CPU doesn't even compare to getting 2.5% (1 in 40) of an even slower resource like the hard drive if you are compiling.

Further experiments



During the time I was considering the CPU to be my future bottleneck, I was thinking which option would be better: 2 x 256MB slices or a bigger 512MB slice.

According to their rules and offering, the two configurations are totally comparable. Even more, using their sharing rule, 2 x 256MB slices should get at least the same CPU cycles under load as the 512MB one. (Further emails from Slicehost's support led me to believe the rule might be oversimplified, but they didn't tell me in what way -- I think the weight of the smallest slices might be even smaller with the bigger slices getting even more of their share).

So, if under load they get the same CPU cycles, it means that when the machine has CPU cycles to spare, I have 2 candidate slices to get those spares.

So the question was: for the 5% price increase I would pay for 2 x 256MB slices compared to 1 x 512MB slice, will I get at least 5% more CPU cycles?

I'm still not certain, from the data I've gathered, that this would happen. Also, the new question now would be: will I get at least 5% more IO operations?


Non-aggression



The above post isn't a rant against Slicehost. I think they are providing a decent service for their price. It is interesting though to see which kinds of usage one can put on a VPS and which are better run on the server in the basement.


512MB update



Well, isn't this interesting. A vertical upgrade to 512MB of RAM is another world entirely. Maybe the new VPS is on a less-loaded machine, but at first sight it's looking way better: the build that previously took 28 hours (fresh) now takes only 40 minutes for a small update. I'll try a clean build later this week and see how fast it is.

So it seems it wasn't only a problem of slow IO, it was also a big problem of not enough RAM leading to swap file thrashing.

Friday, November 07, 2008

I guess it has begun: the environment is at fault for everything

I'm always amazed at the amount of bullshit people are able to come up with, especially when explaining some corporate move.

Take for example my main bank BRD - Groupe Société Générale. Yes, it's the same Groupe Société Générale which showed at the end of 2007 a € 4.9 billion fraud. But it's OK since the Romanian branch is really profitable for them due to limited consumer education here and powerless consumer protection institutions.

I just noticed a new message from them on the Internet Banking site: due to increased environmental awareness from the Bank, they are encouraging people to get alternative bank account statements via online banking or by post. Otherwise you are entitled to one printed account statement per month from their offices.

The reason is, of course, to save the trees by printing less. Of course, they are willing to print tons of the stuff if you are willing to pay -- which will go directly into their profit, but that's another problem, no?

Add to this that they also increased the fee for having an account by 20% for individuals and 50% for companies. That probably also had some environmental reasoning that's escaping me.

Anyhow, I'm looking forward to more price increases and consumer ripoffs that are going to be done in the name of the trees.

Too bad we people can't buy our own carbon credits so that companies won't be able to offload that extra cost onto us in the name of the environment. But you know what? I'm pretty sure someone will introduce carbon credits for the masses. After all, why not? It's a nice way to bring some more money to the state budget.

And only then will that old saying come true: they'll tax you for the air you breathe!

Well, technically for the air you exhale but we're close enough.

Thursday, August 07, 2008

No new mail! Want to read updates from your favorite sites?

For a while now I've been using GMail's "Archive" button aggressively on my inbox. The end result has been that from thousands of emails, I now have 0 (zero)! Everything is archived.

When I get a new email, it sits in the Inbox until it is resolved (ie. I reply or read it). Then it's instantly archived. Out of sight, out of mind.

I've found that this technique greatly reduces the information overload coming from emails. With a full inbox that was also showing snippets of the message (ie. small previews), every time I looked at my inbox I had some information to process. Like: "oh, look, that one is starred, I wonder when they'll reply" or "hm, it's been quite some time since I've got an email from X, as the name is at the bottom of the inbox", etc.

Basically a full inbox sends you some information even when no unread emails exist. It's also quite a bad way to "search" for email. I used to manually look for some subject and/or sender in order to hit reply. Now I just use GMail's search.

I remember some TED video where the speaker said something like: our brain likes new information, we have an addiction to new stuff. Which is exactly what email feeds. It feeds our addiction to new things, even by just having a full list of previously received emails. I also assume that's why sites like Slashdot, Digg and Reddit are quite popular: they feed us new, easy to process information. Imagine brain junk-food if you will, or the Internet equivalent of "too much TV will rot your brain".

Related to this need to always get new stuff, I find it interesting the way Google handles this. When your inbox is empty, you get this message: No new mail! Want to read updates from your favorite sites? Try Google Reader (with a link to google reader).

So what Google is doing here is providing us with what we have become used to. Not enough interruptions, not enough new stuff from email? Why, gee, why don't you try this other source of new things: Google Reader. Come on, get a quick fix!

Thursday, July 17, 2008

Oh, my, how the NetBeans community has grown!

For quite some time now I've noticed an interesting trend: I don't have the time to read the email in the NetBeans mailing lists. A lot of emails where I could have given some help just fly by me as there are just too many.

Just now openide@ has 2000 unread messages, the oldest unread being from 26 November 2006 about the Manifest File Syntax tutorial (boy, a lot has changed in the Editor APIs). nbdev@ also has about 1700 unread but that's ok as I rarely post / answer there.

Now, this trend seems to be caused by two reasons: me being busy (and lately I'm working full-time on getting the Editor APIs usable in a standalone way) and the community growing.

I do remember the time when I had zero unread messages. Now I hardly notice when another hundred adds up.

So, how do you guys handle the workload?

Of course, the solution might be to be a little more methodical about it and dedicate some exact time (like 30 minutes / day) but it just doesn't seem to work with me. Must be the 100 Editor modules I have open right now in the IDE -- sigh...