Tuesday 19 July 2016
New presentation - "Building Stuff for Fun and Profit - Confessions from a Life in Cables and Code"
I recently spoke at BuildStuff Odessa, and had a great time. It's the only conference where I've seen developers on sun loungers, by the pool, working away at their laptops.
I've tried a new experiment with my slides, and am publishing them with speaker notes, so that they make more sense on Slideshare:
Unfortunately, this makes the actual slide tiny. I experimented with Slideshare's built-in speaker note view, and that hid the notes in a big list. Older versions of Keynote apparently have a more compact speaker note export, but unless that gets re-introduced, it seems I'm stuck with the rather inelegant format.
Monday 4 July 2016
Six Myths and Paradoxes of Garbage Collection - MSc Thesis
I'm not sure why I didn't do this until now, but I've made my MSc dissertation, "Six Myths and Paradoxes of Garbage Collection," available online. It was written in 2007, so in tech terms, it can be considered quaintly antique - back when it was written, there was still a debate about whether garbage collection was a good idea. StackOverflow hadn't been invented yet, and C++ was the language to beat.
What's the dissertation about? Well, this is the abstract:
Many myths and paradoxes surround garbage collection. The first myth is that garbage collection is only suitable for the incompetent, unskilled, or lazy. In fact garbage collection offers many architectural and software engineering advantages, even to the skilled developer. The second myth is that garbage collection is all about collecting garbage. Garbage collectors also include an allocation component, which, along with their powers of object rearrangement, can make a significant difference to application performance. Thirdly, criticisms of garbage collection often focus on the pause times, and responses to these criticisms often focus exclusively on reducing pause times, in the mistaken belief that small pause times guarantee good application response times. Pause times are also often used as a metric of general application performance, and an increase in pause times is taken as an indicator of worsened performance, when in fact the opposite is often true. Paradoxically, even the total amount of time spent paused for garbage collection is not a good predictor of the impact of garbage collection on application performance. Finally, the sixth myth is that garbage collection has a disastrous performance impact. While garbage collection can hurt application performance, it can also help application performance to the point where it exceeds the performance with manual memory management.
Although some of the arguments it makes are no longer necessary, there's a lot I still like about the paper. It's got equations, it's got queueing theory, it's got dissections of usenet trolling about GC, it's got diagrams, and - the best bit - photos of garbage bins. You can read the full 84 pages here; it's licensed under Creative Commons Non-Commercial Share Alike.
Wednesday 25 February 2015
Interacting with headless computers (or "How to not keep losing your raspberry pi on the network")
I spend a fair amount of time playing with Raspberry Pis and other ARM-based boxes, like pcDuinos and Utilite. (If my boss is reading this, I'm not "playing", I'm "working".) I still find it amazing how small, cheap, and powerful these devices are.
One thing it's taken me a while to work out is how to connect to the computer. It's kind of a prerequisite for doing anything more interesting, like embedding it in a hat or a ball, or just ... using it. It should be so easy, right? A lot of set-up guides start with you plugging the computer into an HDMI monitor and USB keyboard. This isn't so practical if I'm out and about with the computer, because I'm not going to be lugging an HDMI monitor with me. Even at home, it's not foolproof - my new pcDuino doesn't seem able to display a picture over HDMI to any of the (six) monitors, TVs, or projectors I tried. At that point it's not so much "headless" as "decapitated".
Connecting by ssh requires less bulky hardware, but it has its own challenges: what on earth is the ip address of this thing? I used to ensure I always knew the ip address by nailing it down. I'd connect the pi to my laptop with an ethernet cable (no need for a fancy crossover one with a modern computer), configure the pi with a static ip address (for example, 192.168.2.10), and then configure my laptop's ethernet interface to think it was 192.168.2.1. This ensured I could easily connect to my pi, but it did require some initial network setup. For my totally inaccessible pcDuino, this is no good. Even for the pi, it's not ideal - the static ip address means that even if I plug it into my router, it's totally divorced from the "real" internet. More seriously, I've had a few panics where I've loaned the pi out and then found I was unable to connect to it when I got it back. (Curse you, nameless colleague who re-configured the pi to use dhcp and didn't tell me!) The static ip settings also left my laptop in a pretty messed up state.
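For reference, the static address was just a few lines in /etc/network/interfaces on the pi - roughly this, though the exact syntax depends on the Raspbian version:

auto eth0
iface eth0 inet static
    address 192.168.2.10
    netmask 255.255.255.0
    gateway 192.168.2.1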
I've now found a new system which avoids some of these disadvantages. I don't touch the ethernet configuration of the arm box, so it can stay using dhcp. I connect it to my laptop with an ethernet cable, but I configure my Mac to share its internet connection from wi-fi to ethernet. This OS magic enables the two computers to see each other without either needing crazy network settings.
I then use nmap to find the pi on the network, as follows:
sudo nmap -sn 192.168.1.0/24
To work out which ip address to scan, I run ifconfig and use the ip address of the bridge100 network interface (but change the last digit to 0). The sudo isn't necessary, but it makes nmap show the MAC addresses. This isn't so useful for the pcDuino, which doesn't have a fixed MAC address, but it really helps identify the pi, which will always have a MAC address in the b8:27:eb range.
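For example, on my mac that check looks something like this (the output here is illustrative - use whatever address ifconfig actually reports for bridge100 on your machine):

ifconfig bridge100 | grep "inet "
	inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255

The 192.168.1.1 there is what gives the 192.168.1.0/24 scan range above.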
Edit - other options: I haven't tried these, but twitter has spoken! If you have a flock (or even a small handful) of identical devices, including ipspeak in the bootable image bypasses the need for any nmap sleuthing. You can plug headphones into the device and it will read out its ip address every thirty seconds. Alternatively, if you have individually configured devices, avahi allows them to be addressed using a memorable name.
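For example, a stock Raspbian image with avahi running advertises itself as raspberrypi.local (assuming the default hostname hasn't been changed), which skips the ip address hunt entirely:

ssh pi@raspberrypi.local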
Acknowledgements: There are many similar guides on the internet. I've posted this here mostly in the hope it will be useful notes to my future self, since I've independently rediscovered my optimum flow about four times. I figured out the nmap commands from a relatively recent post at stackoverflow, and the idea of using network sharing to connect my pi to my laptop without wrecking my laptop's settings is from alvin_jin.
Thursday 27 September 2012
Java basics: converting a collection to a string
If you look up how to write a map or a list out as a string, you can find lots of complicated answers involving loops and even XMLEncoders. It doesn't have to be that hard. The easiest way to convert a Map into a nice string?
String niceString = Arrays.toString(map.entrySet().toArray());
It's even easier for a list:
String niceString = Arrays.toString(list.toArray());
Ta-daaa!
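To see it end to end, here's a minimal, self-contained sketch putting both together (the class name and sample data are just for illustration):

import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NiceStrings {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<String, Integer>();
        map.put("apples", 3);
        map.put("pears", 5);
        // Prints something like [apples=3, pears=5]
        System.out.println(Arrays.toString(map.entrySet().toArray()));

        List<String> list = Arrays.asList("one", "two", "three");
        // Prints [one, two, three]
        System.out.println(Arrays.toString(list.toArray()));
    }
}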
Wednesday 26 September 2012
Enterprise OSGi in Action
We're getting really close now ...
Are you looking for a way to write an OSGi web application? Or get JPA persistence working in OSGi? Or just group your OSGi bundles into something a bit more granular? Do you need a guide to best practices for OSGi build and test?
Modern enterprise applications must be scalable, maintainable, and modular. Unfortunately, by itself Java EE doesn't do modularity very well. The Enterprise OSGi model enforces simple rules to make Java better at modularity. And now, projects such as Apache Aries and Geronimo provide pluggable components that make it easier than ever to use OSGi's mature modularity system in your own enterprise applications.
Enterprise OSGi in Action is a hands-on guide for developers using OSGi to build the next generation of enterprise Java applications. By presenting relevant examples and case studies, it guides the reader through the maze of new standards and projects. Armed with this knowledge, readers will learn how to build and deploy enterprise OSGi applications, use automatic dependency provisioning, declaratively provide enterprise qualities of service to their business logic, and make use of the Java EE technologies they know and love in a familiar way.
Manning have a reader forum where you can ask Tim and me questions (or I'm on twitter @holly_cummins).
Monday 15 June 2009
Health Center 1.0 released
Last week was a big week for the Health Center team. Version 1.0 of the Health Center was released. Full installation instructions are available from the Health Center homepage. It now works with IBM Support Assistant and is suitable for use in production (with recent IBM JVMs). A method profiler with overhead low enough that it can be used in production and left on all the time is a pretty special thing, I think. One which is only part of a tool that also assesses system stability, triages a range of performance problems, and makes performance tuning accessible to non-experts is even more special.
To celebrate, the whole team baked and brought in carrot cake, brownies, chocolate chip cookies, lemon cake, rocky road squares, and flapjacks. We even had balloons. It was very nice, but I've never eaten so much at work, and I think we were all ready to explode by the afternoon - in future we should delegate and have only half the team bake per release.
Tuesday 12 May 2009
How to interpret a method profile
In a previous post, I described the general methodology I use to diagnose performance problems.
Once an application has been identified as CPU-bound, either by using the Health Center or CPU monitoring, the next step is to figure out what is eating CPU. In a Java application, this will usually be Java code, but it could be native code. Profiling native code usually requires platform-specific tools; on linux, I use tprof. Profiling Java code is a lot easier, and is more likely to yield big performance improvements, so I usually start with a Java profile and only profile native code if I didn't get anywhere with the Java profile. For Java profiling, I use the Health Center. It's got a few advantages: there's no bytecode instrumentation needed, there's no need to restrict profiling to just a few packages, and the overhead is very low, so it won't affect the performance characteristics of what you're trying to profile.
So what does a method profile tell you? Simply put, it tells you what your application is spending its time doing. More precisely, it tells you what code your application is spending its time running - it doesn't tell you when your application is waiting on a lock instead of running your code, and it doesn't tell you when the JVM is collecting garbage instead of running your code. Assuming locking and GC aren't the cause of the performance problem (see triaging a performance problem), the method profile will give you the information you need to make your application go faster.
The application is doing too much work, and that's slowing it down. Your aim in performance tuning is to make the application do less work. There are lots of ways to make code more efficient. Sometimes people start performance tuning by code inspection - they read through the code base looking for obvious inefficiencies. I've done this myself lots of times, but it's not a particularly efficient technique. Say I find a method which is pretty carelessly implemented, and I double its speed with a bit of refactoring. Then I triumphantly re-run my application, only to discover nothing's changed. What's going on? The problem is that a big performance improvement on a method which is rarely called isn't going to change much of anything. For example, if I double the speed of a method which uses 0.5% of my CPU time, I've sped my application up by an imperceptible 0.25%. If, on the other hand, I shave 10% of the time of a method which is using 20% of my CPU, my application will go 2% faster. So the first rule of performance tuning is to optimise the methods at the top of the profile and ignore the ones near the bottom.
This is an example from a method profiler, in this case the one in the Health Center. One method is clearly using more CPU than the rest, and so it's coloured red. In this case, 60% of the time the JVM checked what the application was doing, it was executing the FireworkParticle.animate() method. This is what's shown by the left-hand 'Self' column. The 'Tree' column on the right shows how much time the application spent in both the animate() method and its descendants. Some profilers call this column 'descendants' instead. Usually the Self figures are more useful for optimising an application.
What makes a method appear near the top of a method profile? It's taking up a lot of CPU time, but why? There are two possible reasons: either the method is being called too often, or the method is doing too much work when it's called. Sometimes it happens that a method really is doing the right amount of work the right number of times, but this is usually only the case after a fair amount of tuning. In their natural state, most programs can - and should - contain inefficiencies. (Remember that premature optimisation is the root of all evil.)
Some profilers can distinguish between a method which is called several times, and one which is called once and then spends a long time executing, but many cannot. The reason is that some profilers operate by tracing - that is, recording every entry and exit of a method. This gives very precise information, but usually carries a fairly heavy performance cost. The IBM JVM can be configured with launch parameters to count or time method executions, but it's only advisable to do this for a restricted subset of methods. An alternate method of collecting profiling information is to sample - that is, check periodically what method is executing. This is much less expensive but doesn't give as much detail as tracing profilers. The Health Center uses method sampling already built into the JVM to allow profiling with extremely low overhead.
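To make the sampling idea concrete, here's a toy sampler in plain Java. It's only a sketch of the principle - the Health Center uses sampling support built into the JVM itself, nothing like this - and all the class and method names are made up:

import java.util.HashMap;
import java.util.Map;

public class ToySampler {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(ToySampler::busyWork, "worker");
        worker.setDaemon(true);
        worker.start();

        // Periodically ask the JVM what the worker is doing and tally
        // the method on top of the stack.
        Map<String, Integer> counts = new HashMap<>();
        int samples = 500;
        for (int i = 0; i < samples; i++) {
            StackTraceElement[] stack = worker.getStackTrace();
            if (stack.length > 0) {
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                counts.merge(top, 1, Integer::sum);
            }
            Thread.sleep(2); // sampling interval
        }
        // The 'Self' column of a profile is essentially these counts as percentages
        counts.forEach((method, count) ->
                System.out.printf("%5.1f%% %s%n", 100.0 * count / samples, method));
    }

    private static void busyWork() {
        while (true) {
            expensive();
            cheap();
        }
    }

    private static double expensive() {
        double x = 0;
        for (int i = 0; i < 100000; i++) x += Math.sqrt(i);
        return x;
    }

    private static double cheap() {
        double x = 0;
        for (int i = 0; i < 1000; i++) x += Math.sqrt(i);
        return x;
    }
}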
Often it will be obvious when inspecting a hot method if it's being called frequently or is slow to run. Code with loops, particularly nested loops, is probably expensive to run. Code which doesn't seem to do much but which is at the top of a profile is probably being called a lot. This leads neatly to the next steps in optimisation: eliminate loops and do less work inside loops for expensive methods, and call inexpensive methods less frequently.
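As a tiny, made-up illustration of "do less work inside loops", here's the classic case of hoisting a loop-invariant call out of the loop body (everything here is hypothetical):

import java.util.ArrayList;
import java.util.List;

public class HoistExample {
    // Stands in for something genuinely costly - a lookup, a format,
    // a size() on an expensive collection.
    static String expensiveLookup() {
        return "particle";
    }

    public static void main(String[] args) {
        List<String> labels = new ArrayList<String>();

        // Before: expensiveLookup() runs on every single iteration
        for (int i = 0; i < 1000; i++) {
            labels.add(expensiveLookup() + "-" + i);
        }

        labels.clear();

        // After: the loop-invariant call is hoisted out, so it runs once
        String prefix = expensiveLookup();
        for (int i = 0; i < 1000; i++) {
            labels.add(prefix + "-" + i);
        }
        System.out.println(labels.size() + " labels built");
    }
}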
How do you go about making sure a method is called less? Method profilers which also record stack traces make it pretty easy to work out where to start. For example, this is the output of the Health Center, showing where calls to one of the top methods in the profile have come from:
In this case, 98% of the time the doSomeWork() method was sampled, it was animate() that called it. 2% of the time, it was draw() that called it. In this case, the next step is to inspect the animate() method and see why it's calling doSomeWork(). Often, at least in the first passes of optimisation, most of the calls to the top method are totally unnecessary and can be trivially eliminated.