Jeff Mesnil is a developer who had to play ‘support’ for a bit. Specifically, he had to read logs produced by others. He immediately discovers a lot of lessons that are not obvious when you only read your own code.
To his list I want to add a couple of things from my own support experience:
- Make sure you have some way to identify which log entries belong together. Use TransactionId if you have one, thread name if you don’t. Weblogic logs put thread name for each message. If this is expensive, use thread ID number instead and correlate them later with Thread dumps.
- Always put timestamps in. Again, if this is expensive, put System.currentTimeMillis() instead of (new Date()).toString() and put a log entry at the start (or periodically) that shows both (to clarify timezones and allow correlation to other dated logs)
- If you have a long operation, try to put a log message at the start and end of it. That way, when you later want to discover what your real life processing time is, you can parse those entries. Again, this requires ID and timestamps. One of the frequent problem we have is customers running into default timeout values being too small, but not knowing what the real value should be for them either.
- Remember that logging is expensive and (at least for log4j) is gated at the write method. So, if you have 100 threads all logging away into the same file, at some point they will all bottleneck on a write method. And if the write method needs to rollover the file or build a complex log string, they will all just wait. I have support case scars to prove this point.
- Try to grep your own logs instead of eyeballing them. If you have difficulty getting a good set of log entries, the person supporting your application will have even more problems.
BlogicBlogger Over and Out