
> Regularly weaseling out of tasks like this wastes the time of multiple people and either ends up back with the original dev or gets dumped on a more responsible worker.

A lot of this depends on the environment and circumstances. If you're in the middle of working on feature X, it's very annoying to have to drop it to look at a bug; sometimes it's necessary, but usually it can wait. This is where having great user support comes in too: capturing what the user was doing, getting relevant logs, and knowing how to reproduce the problem are all important, and if you don't have a good support team, that work falls to the developers.

The other big factor is external pressure. If management asks for frequent updates and pushes to get through tickets quickly (especially common at consultancy-type shops), then bug fixing is miserable, high-pressure work that I will avoid at all costs. In an environment without those pressures, bug fixing can be fun: give me the biggest, most spaghetti-like enterprise system and no time pressure, and it feels like getting paid to solve giant Sudoku puzzles all day.

While we're all swapping war stories, I'll share my most epic 2-line fix, at a place where I was afforded the time. I was working on this huge mess of over-abstracted, multi-threaded, spaghetti enterprise OO, trying to track down a bug that happened maybe once a fortnight. I tried narrowing it down to reproduce it, but nothing was working; the stack trace was about 15 levels of indirection away from the trigger, so the most I could narrow it down to was hundreds of thousands of lines of code. After a couple of weeks of trying things and getting nowhere, I told the boss we'd probably never track this down, but they insisted I keep trying. Eventually I wrote a script to copy all the logs locally, where I could search them and do some analysis. After grepping 18 months of logs, I noticed that on 3 or 4 occasions the same error had happened twice within 5 seconds. From there it was a matter of finding "Sleep(5000)" in the code to know exactly where the error was. It turned out that the 15 levels of indirection were quite slow and were fetching a stale piece of data we already had anyway, so the time spent turned into a nice little improvement.
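
For anyone wanting to try the same trick, here is a rough sketch of the "same error twice within 5 seconds" search. It is not the script I actually wrote; the timestamp format, the function name and the error string are all made up for illustration, so adjust them to whatever your logs look like.

```python
from datetime import datetime, timedelta


def find_clustered_errors(lines, needle, window_seconds=5.0):
    """Return (earlier, later) timestamp pairs for consecutive matching
    log lines that occurred within window_seconds of each other."""
    stamps = []
    for line in lines:
        if needle not in line:
            continue
        # Assumed (hypothetical) line format: "YYYY-MM-DD HH:MM:SS message..."
        try:
            stamps.append(datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S"))
        except ValueError:
            continue  # skip lines that do not start with a timestamp
    stamps.sort()
    window = timedelta(seconds=window_seconds)
    # Compare each occurrence with the next one in time order.
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a <= window]


# Hypothetical sample data; "StaleDataException" is an invented error name.
sample = [
    "2021-03-04 10:15:30 ERROR StaleDataException in worker",
    "2021-03-04 10:15:35 ERROR StaleDataException in worker",
    "2021-03-04 10:20:00 INFO heartbeat",
    "2021-03-18 09:00:00 ERROR StaleDataException in worker",
]
for earlier, later in find_clustered_errors(sample, "StaleDataException"):
    print(earlier, later, later - earlier)
```

A suspiciously round gap between occurrences is the useful signal here, since it suggests a hard-coded delay or retry interval worth searching the code for.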

The scripts for the logs became invaluable too. So many times, when we got "you incompetent idiots broke x with your last update", we could reply a minute later with "x has been happening since <time before any of us worked there>, you only just noticed".


