Call me a fanboy if you must, but the people at Coding the Architecture know their stuff. The slides Simon Brown put together on Why Software Projects Fail are a very good, very specific description of what a good software architect can bring to a project.
Test for good developers
I’ve definitely made some interviewing mistakes in my day. One mistake which I hope to avoid in the future is hiring people who don’t think architecturally, as so eloquently described by Simon Brown over at Coding the Architecture.
I’ve recommended a couple of people who seemed quite competent when talking about their technical skills during the interview, but who started to glaze over when I introduced them to the organization of the project on their first day on the job. I got a horrible sinking feeling in the pit of my stomach when this happened, as I realized how much extra work I’d made for the team.
These were programmers whose basic coding skills were good, and who even had a good intellectual knowledge of “architecture issues”. (They understood, for example, why stateless or stateful session EJBs were appropriate for different situations.) But when presented with a system which used a command pattern for its primary interface, they were baffled. Instead of adding a new type of command object and using the existing command APIs, they put database calls directly in the JSP code. Large static data structures which had been initialized once and kept in memory were being initialized with every request after these developers “reworked” that code. It took weeks of coaching in each case to get them on track.
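To make that concrete, here is a minimal Java sketch of the shape I mean. It is not the code from that project; every name in it (`Command`, `CommandDispatcher`, `LookupCountryCommand`, `CountryTable`) is made up for illustration. New behavior goes in as a new command registered with the dispatcher, and the big lookup table is built once when its class loads rather than on every request.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical command interface: every operation the front end can request implements it.
interface Command {
    String execute(Map<String, String> params);
}

// Read-only reference data, built once at class-load time and shared by all requests.
final class CountryTable {
    static int loads = 0; // counts how often the table is built (demonstration only)
    private static final Map<String, String> NAMES = load();

    private static Map<String, String> load() {
        loads++;
        Map<String, String> names = new HashMap<>();
        names.put("NL", "Netherlands");
        names.put("US", "United States");
        return Collections.unmodifiableMap(names);
    }

    static String nameOf(String code) {
        return NAMES.getOrDefault(code, "unknown");
    }
}

// Adding a feature means adding a command like this one, not putting logic in the page.
final class LookupCountryCommand implements Command {
    @Override
    public String execute(Map<String, String> params) {
        return CountryTable.nameOf(params.get("code"));
    }
}

// Single entry point: the page only knows command names, never the data layer.
final class CommandDispatcher {
    private final Map<String, Command> commands = new HashMap<>();

    void register(String name, Command command) {
        commands.put(name, command);
    }

    String dispatch(String name, Map<String, String> params) {
        Command command = commands.get(name);
        if (command == null) {
            throw new IllegalArgumentException("Unknown command: " + name);
        }
        return command.execute(params);
    }
}

class CommandDemo {
    public static void main(String[] args) {
        CommandDispatcher dispatcher = new CommandDispatcher();
        dispatcher.register("lookupCountry", new LookupCountryCommand());

        Map<String, String> params = new HashMap<>();
        params.put("code", "NL");

        // Two "requests" against the same command and the same in-memory table.
        System.out.println(dispatcher.dispatch("lookupCountry", params));
        System.out.println(dispatcher.dispatch("lookupCountry", params));
        System.out.println("table loads: " + CountryTable.loads);
    }
}
```

Running `CommandDemo.main` dispatches the same command twice, and `CountryTable.loads` stays at 1 because the table is only built when the class is first loaded.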
So in the interview, I think it’s a good idea to always walk through the architecture of a hypothetical project, and ask questions about things like the lifecycle of objects, resource bottlenecks, existing usage patterns, and scaling strategies. Although it’s more work when preparing for the interview, it’s absolutely less work than trying to work with someone like that on your team.
JavaServer Faces performance concerns addressed
I’ve been back in JSF land recently, and today I found Eelco Klaver’s slides on JSF performance. Anyone considering JSF should read the slides. They provide a clear and succinct analysis of why there is a very serious out-of-the-box performance issue with JSF, and what to do about it. This issue is not widely documented, so I thought I’d add one more link to a really good description and solution.
The problem is this: out of the box, JSF keeps way too much data around in the session. On the last project where I used JSF, before my colleagues and I realized this, the 10 to 50 MB sessions that JSF creates were really clobbering our server. And although Eelco Klaver recommends setting com.sun.faces.numberOfViewsInSession to a value of 5, we found things worked just fine with a value of 1. Our site was very light on forms, so your mileage may vary.
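For reference, a minimal sketch of how that setting is usually applied, assuming the Sun reference implementation (hence the `com.sun.faces` prefix) and a standard `WEB-INF/web.xml` deployment descriptor. The value 1 is what worked on a forms-light site, not a general recommendation:

```xml
<!-- WEB-INF/web.xml (fragment): cap how many views JSF keeps per session -->
<context-param>
    <param-name>com.sun.faces.numberOfViewsInSession</param-name>
    <!-- 1 suited a forms-light site; Eelco Klaver's slides suggest 5 -->
    <param-value>1</param-value>
</context-param>
```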
Finally, just for good measure, here is some documentation on all of these JSF options.
Another excellent point
How does one arrive at good design, whether it be the architecture for a huge system or the layout for a single interface? Nick Malik over at Microsoft sums it up very nicely:
Design is a process where you consider the problem and then propose multiple, competing, wildly creative solutions. You then narrow down your brainstorm and weed out the bits that won’t work… and then you propose multiple, competing, wildly creative refinements… and the cycle continues. This happens a couple of times, until you have refined your solution by being creative, then logical, then creative, then… you get the picture.
Another post definitely worth reading.
Doing it right at Facebook
I came across this today, and was so pleased by the clarity of thought that I needed to share it. From Robert Johnson at Facebook, describing their log aggregation system, Scribe:
The second major design decision was about reliability. We chose a middle ground here, reliable enough that we can expect to get all of the data almost all of the time, but not reliable enough to require heavyweight protocols and disk usage. More specifically, Scribe spools data to disk on any node to handle intermittent connectivity or node failure, but it doesn’t sync a log file for every message, so there’s a possibility of a small amount of data loss in the event of a crash or catastrophic hardware failure. Basically, this is more reliability than you get with most logging systems, but not something you should use for database transactions. As it turned out, this is a reasonable level of reliability for a lot of use cases, and has made scaling much easier. It’s also the source of a lot of the hard-learned lessons: getting the system to catch up seamlessly after a significant network problem is tricky, especially when there are tens or hundreds of gigabytes of data backed up.
This is a perfect example of skillful architecture: picking an appropriate design for the problem at hand, without getting seduced into a seemingly “perfect” solution plucked from some textbook.
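The trade-off in that quote can be sketched in a few lines of Java. This is not Scribe’s code or API; `SpoolFile` and its `syncEvery` parameter are hypothetical, and syncing every N messages is just one assumed way of not syncing on every message. Anything written since the last sync can be lost in a crash, which is the small window of data loss the quote accepts.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical append-only spool file that forces data to disk only occasionally.
final class SpoolFile implements AutoCloseable {
    private final FileOutputStream out;
    private final int syncEvery; // assumed policy: force to disk every N messages
    private int sinceSync = 0;
    int syncs = 0;               // how many times we forced data to disk

    SpoolFile(Path path, int syncEvery) throws IOException {
        this.out = new FileOutputStream(path.toFile(), true); // true = append
        this.syncEvery = syncEvery;
    }

    void append(String message) throws IOException {
        // The write reaches the OS, but is not guaranteed to be on disk yet.
        out.write((message + "\n").getBytes(StandardCharsets.UTF_8));
        if (++sinceSync >= syncEvery) {
            forceSync();
        }
    }

    private void forceSync() throws IOException {
        out.getFD().sync(); // the expensive call we avoid making per message
        sinceSync = 0;
        syncs++;
    }

    @Override
    public void close() throws IOException {
        if (sinceSync > 0) {
            forceSync(); // a clean shutdown loses nothing
        }
        out.close();
    }
}

class SpoolDemo {
    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("spool", ".log");
        try (SpoolFile spool = new SpoolFile(path, 100)) {
            for (int i = 0; i < 1000; i++) {
                spool.append("event " + i);
            }
            System.out.println("syncs so far: " + spool.syncs);
        }
        System.out.println("lines written: " + Files.readAllLines(path).size());
    }
}
```

With `syncEvery` set to 1 this behaves like a log that syncs on every message; raising it trades a bounded amount of possible loss for far fewer disk flushes.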
The rest of his post is absolutely worth a read.