Our company is involved with many Open Source projects, but $OurBigApp ain't one of 'em, in part because we have competition, and you'd have a hard time finding bigger champions of closed source than them. Also because $OurBigApp used to be a different company which was bought by a bigger company which is the one involved in Open Source. And also because, well, if we gave away how it all worked, a lot of us would have to look for real jobs.
Seriously, there is value or no one would buy our software. And yet, we may go Open Source within a few years and move from a sales to a service model. Or maybe not. Time will tell. It doesn't really matter to me; I'll have a job either way.
Meanwhile, the secrecy results in some interesting tickets, such as a case I'm still working on. It also makes such tickets frustrating as hell.
Our software has its own file system and we even explain a lot of the inner workings. Unfortunately, someone decided we needed to keep mum on a few bits of it, and one of those bits is the method used to distribute files into subdirectories. Why for the love of all that does not suck does someone actually think that's something we need to keep under wraps? There's nothing terribly innovative about it. More to the point, opening it up completely would work to our benefit, allowing customers to rework the system to best meet their needs.
It seems that whichever jackass made the decision did so thinking, "We need to have secrets and that's one of them. They'll live without the knowledge."
Except we have customers storing hundreds of millions of files. Billions in some cases. And that's where we have the problem, and it's not limited to Windows.
Boring but brief explanation:
Windows can't store more than four billion files in any directory. Once you reach about half that, the system gets really sluggish -- so much so that requests will time out while the system is still looking for the file. It's worse in UNIX for very different reasons, getting very slow as the file count reaches only around four million despite the ability of UNIX to handle many more files overall. The reason is that at 4M, you need yet another inode: searching for a file requires slogging through triple indirect blocks.
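For the curious, the usual generic way to keep any single directory small is to hash each file name and use the first few hex digits as nested subdirectory names. To be clear, this is not $OurBigApp's method, which stays under wraps; it's a minimal sketch of a plain hash-prefix layout, and `shard_path`, its `levels` and `width` parameters, and the `/data` root are all names made up for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def shard_path(root, filename, levels=2, width=2):
    """Return root/<aa>/<bb>/filename, where aa and bb are leading hex
    digits of the SHA-1 of the file name (hypothetical layout)."""
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    # Slice the digest into `levels` chunks of `width` hex characters each.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, filename)


if __name__ == "__main__":
    # The same name always maps to the same subdirectory.
    print(shard_path("/data", "invoice-0001.pdf"))
```

With two levels of two hex digits there are 65,536 leaf directories, so a billion files works out to roughly 15,000 per directory, assuming the hash spreads names evenly. Because the path is derived from the name alone, nothing has to be catalogued to find a file again.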
I got a ticket a few weeks ago asking about some problems with the file system and our utility for setting up subdirectories. I answered quickly, sending the utility, and everything was running smoothly again. Then the guy asked about how the subdirectory distribution system worked.
"Well, I'm sorry to say that it's proprietary. But maybe I can get some additional information anyway. After all, this is an older version of our application and this particular method is no longer even used in our software."
The customer was floored and sent off a gushing note to my manager about people going the extra mile. Yay me.
I actually managed to get to the right person who would be able to tell me about it. Except he couldn't. What he told me I already knew and he couldn't get authorisation to tell me more. I tried again to see the source code just to figure out what the hell the distributor was doing and how it calculated where to put each file. No dice. And so I have to send the guy an apology that I couldn't do more for him and that he'll be stuck running a few scripts to constantly catalogue and track the files as they're written to the file system.
However, most of the time we get requests for information on the inner workings, it's shit like this:
Hi, we want to reverse engineer $YourBigApp. How do we do it?
Fuckwits.